Stanford  University 


Department  of  Computer  Science 
Stanford,  California,  94305-2140 


Terry  Winograd 

Professor  of  Computer  Science 

Telephone: 415 723 2780

Fax:  415  725  7411 

Internet: winograd@cs.stanford.edu


Sept.  1, 1994 
To  the  participants  in  the  Stanford  World-Wide-Web-Workshop 

The  enclosed  information  about  the  workshop  is  also  on  the  Web  (except  for 
the  brief  overview  from  the  Communications  of  the  ACM).   If  you  have 
access  to  a  Web  browser,  you  should  start  with  the  workshop  home  page  URL: 

http://www-pcd.stanford.edu/workshop/workshop.html

If  you  have  questions  about  arrangements  for  the  workshop,  please  contact 
Carolyn Tajnai <tajnai@cs.stanford.edu>. For questions about the speakers
and  content,  you  can  ask  me. 

We're  all  looking  forward  to  a  fun  and  educational  workshop.  Thanks  for 
coming. 


Sincerely, 


Terry  Winograd 


Schedule  for  Stanford  WWW  Workshop 
Stanford  Computer  Forum  -  September  20-21,  1994 
Tuesday, Sept 20

8:00  -  Continental  Breakfast 
8:45  -  Welcome  and  introduction 
9:00  -  Session  1 

• Naming and resource location - Larry Masinter, Xerox PARC - masinter@parc.xerox.com

•  Search  and  query  protocols  -  Brewster  Kahle,  WAIS  Inc,  brewster@wais.com 

• Spiders and autonomous agents - Yoav Shoham, Stanford Nobotics group -
shoham@cs.stanford.edu

•  Web  structure  and  meta-information  -  Terry  Winograd,  Stanford  Project  on  People, 
Computers,  and  Design  -  winograd@cs.stanford.edu 

10:30  -  Break 
11:00 - Discussions
12:00 - Lunch

2:00  -  Session  2 

• Browsers - Dale Dougherty, O'Reilly & Associates - dale@ora.com

• A Visual Design Perspective - Ev Shafrir, Hewlett-Packard - shafrir@hpuid.ptp.hp.com

• Page layout and portability - Steve Zilles, Adobe - zilles@adobe.com

• Generating virtual pages - Tom Gruber, Stanford Knowledge Systems Laboratory -
gruber@hppcs.stanford.edu

3:30  -  Break 
4:00  -  Discussions 
5:00  -  Informal  reception 

Wednesday,  September  21 

8:30 - Continental Breakfast
9:00  -  Session  3 

•  Security  -  Allan  Schiffman,  Enterprise  Integration  Technologies  -  ams@eit.com 

• Using the Internet for commerce - Mike Genesereth, Stanford Logic Group -
genesereth@cs.stanford.edu

• Enhancing the WWW with co-presence - Ehud Shapiro, Ubique Ltd. - udi@ubique.co.il

• On-line communities - Sean White, Interval Research - white@interval.com

• Digital Libraries - Hector Garcia-Molina, Stanford Database Group - hector@cs.stanford.edu

10:30 - Break
11:00 - Discussions
12:00 - Planning for birds-of-a-feather sessions in the afternoon
12:30 - Lunch break

Wed. p.m.: Open schedule with meeting rooms made available

In addition to these talks, we will have demonstrations of a number of Web-based systems being
developed or used at Stanford. We will have a number of workstations available for participants to
browse  the  web  and  follow  up  on  information  resources  mentioned  in  the  sessions. 


Readings for Stanford WWW Workshop
Stanford Computer Forum - September 20-21, 1994

This  information  was  prepared  for  participants  in  the  WorldWideWeb  Workshop.  It  is  intended  as  a 
starting  point  for  further  web  exploration,  and  is  not  comprehensive  or  definitive  (most  information 
on  the  Web  isn't!). 


WWW  (The  World  Wide  Web) 

Brief  executive  summary 

A short introduction to the Web by Tim Berners-Lee, then at CERN, now at MIT, the primary
originator  of  the  Web. 

An illustrated talk

An online seminar by Berners-Lee on the World-Wide Web (W3). It gives an overview of W3 for
those  to  whom  it  is  new,  including  a  review  of  the  current  status  of  software,  and  then  mentions 
some  plans  for  the  future. 

Guide to Cyberspace

Kevin Hughes from EIT wrote this guide (last update, May 1994). It offers a nice introduction to
the Web's conceptual structure, to hypermedia, to Mosaic, to HTML and URLs (Uniform Resource
Locators), and to the current state of the Web.

World Wide Web FAQ (Frequently Asked Questions)

This  page  has  many  pointers  to  detailed  information,  and  the  high-level  information  that  it  provides 
is  fairly  useful.  It  includes  a  short  comparison  of  WWW,  Gopher,  and  WAIS. 

World-Wide Web Consortium: A Short Prospectus

The W3O (WWW Organization) is a consortium located at MIT, in collaboration with CERN,
which is attempting to play a central role in standardizing and promoting the Web, as the
X-Consortium at MIT did with X-Windows.


HTML  (The  HyperText  Markup  Language) 

A beginner's guide to HTML

This  is  a  primer  for  producing  documents  in  HTML,  the  markup  language  used  by  the  World  Wide 
Web. 


The  official  HTML  specification 

There is much debate over just what "official" means on the Web. The emergence of the WWW
consortium  may  help.  For  the  moment,  files  produced  at  CERN  (such  as  this  one)  are  more  official 
than  others. 

HTML  Description 

The  abstract  for  the  current  HTML  specification  by  Dan  Connolly  (July  1994).  It  has  links  to  the 
specification  itself  (highly  detailed),  and  the  abstract  itself  is  informative. 

Technical  specs  for  HTML 

Detailed  Technical  specs  for  those  interested  in  knowing  what  is  really  going  on. 

HTML+ (or HTML 3, as it is now known)

Covers the important elements that will be introduced in planned new versions.


SGML  (the  Standard  Generalized  Markup  Language) 

A  description  of  SGML 

SGML  is  the  basis  for  HTML  (the  Hyper  Text  Markup  Language). 

SGML FAQ

This article contains answers to questions that are frequently posted to the news group
comp.text.sgml. It is intended for newcomers to the list and SGML beginners.


HTTP  (The  Hypertext  Transfer  Protocol) 

HTTP technical description

HTTP is a protocol with the lightness and speed necessary for a distributed collaborative
hypermedia information system. It is a generic stateless object-oriented protocol, which may be
used for many similar tasks such as name servers, and distributed object-oriented systems, by
extending the commands, or "methods", used. A feature of HTTP is the negotiation of data
representation, allowing systems to be built independently of the development of new advanced
representations.

Secure  HTTP 

Proposal for standards for a version of HTTP that provides security through cryptographic
techniques.


Searching 


CUSI - Search Everything

A metaindex that lets you get to a lot of other indexes of what is on the Web.

The Internet Computer Index

More of the same, including a small sample of what kind of stuff is available for ftp'ing on the web,
broken  down  by  computer  type. 


Some  interesting  things  to  look  at 

The Virtual Tourist - A nice way of looking at information from a "World-Wide" perspective; it also
shows the use of image maps.

The World-Wide Web Virtual Library: Subject Catalogue - An ambitious attempt by Arthur Secret at
CERN, and volunteers, to provide a hierarchical subject catalog of what is on the web.

PLANET  EARTH  HOME  PAGE  -  Another  interesting  way  of  approaching  a  lot  of  information, 
much  of  it  graphical  and  geographical. 

Yahoo  page 

An  ad  hoc  attempt  by  some  graduate  students  at  Stanford  to  maintain  a  hierarchical  listing  of  "cool 
stuff"  on  the  net. 

GNN  Home  Page  (Global  Network  Navigator) 
The  first  web-based  commercial  magazine. 

The InterNIC InfoGuide Home Page - Information about the Internet and guides to it (including a
server  that  helps  you  locate  people  on  the  net). 

The Web at Nexor - A number of facilities: List of Archie gateways in the Web. Macintosh Archive in
hypertext. ALIWEB: Archie-Like Indexing in the Web. Search engines for RFCs and Internet
Drafts. CUSI: a configurable unified search interface. A local copy of the CUI W3 Catalog (July
1st). A mirror of frequently used international files. A section dedicated to the Perl programming
language. A section on World Wide Web Robots, Wanderers and Spiders. References to current
satellite images, etc.

Usenet FAQ lists in hypertext - Lists of the Frequently Asked Questions postings collected for all of
the Internet News groups.


winograd@cs.stanford.edu


This page is http://www-pcd.stanford.edu/workshop/readings.html


World  Wide  Web  Frequently  Asked  Questions 

This document resides on the World Wide Web on SunSITE (URL is
http://sunsite.unc.edu/boutell/faq/www_faq.html).

If  you  are  unfamiliar  with  the  term  "URL",  read  on  and  learn! 

Last  update:  9/2/94 

Contents 

• 1: Recent changes to the FAQ
• 2: Information about this document
• 3: Elementary Questions
    • 3.1: What are WWW, hypertext and hypermedia?
    • 3.2: What is a URL?
    • 3.3: How does WWW compare to gopher and WAIS?
• 4: Accessing the Web (User Questions)
    • 4.1: Introduction: How can I access the web? (Even by email!)
    • 4.2: Browsers Accessible by Telnet
    • 4.3: Obtaining browsers
        • 4.3.1: Microsoft Windows browsers
        • 4.3.2: MSDOS browsers
        • 4.3.3: Macintosh browsers
        • 4.3.4: Amiga browsers
        • 4.3.5: NeXTStep browsers
        • 4.3.6: X/DecWindows (graphical UNIX, VMS) browsers
        • 4.3.7: Text-based Unix and VMS browsers
        • 4.3.8: Batch-mode "browsers"
    • 4.4: How can I access the web through a firewall?
    • 4.5: What is on the web?
        • 4.5.1: How do I find out what's new on the web?
        • 4.5.2: Where is the subject catalog of the web?
        • 4.5.3: How can I search through ALL web sites?
    • 4.6: How can I save an inline image to disk?
    • 4.7: How can I get sound from the PC speaker with WinMosaic?
    • 4.8: I have a Windows PC (or a Macintosh). Why can't I open WAIS URLs?
    • 4.9: I'm running XMosaic. Why can't I get external viewers working?
    • 4.10: Hey, I know, I'll write a WWW-exploring robot! Why not?
    • 4.11: How do I send newsgroup posts in HTML to my web client?
• 5: Providing Information to the Web (Provider Questions)
    • 5.1: Introduction: How can I provide information to the web?
    • 5.2: Obtaining Servers
        • 5.2.1: Unix Servers
        • 5.2.2: Macintosh Servers
        • 5.2.3: Windows and Windows NT Servers
        • 5.2.4: MSDOS Servers
        • 5.2.5: VMS Servers
        • 5.2.6: Amiga Servers
    • 5.3: Producing HTML documents
        • 5.3.1: Writing HTML directly
        • 5.3.2: HTML editors
        • 5.3.3: Converting other formats to HTML
    • 5.4: How do I publicize my work?
    • 5.5: Can I buy space on an existing server?
    • 5.6: Advanced Provider Questions
        • 5.6.1: How do I set up a clickable image map?
        • 5.6.2: How do I make a "link" that doesn't load a new page?
        • 5.6.3: Where can I learn how to create fill-out forms?
            • 5.6.3.1: How can I create hidden fields in forms (keeping state)?
            • 5.6.3.2: How can users email me through their browsers?
        • 5.6.4: How do I comment an HTML document?
        • 5.6.5: How can I create decent-looking tables and stop using <PRE>...</PRE>?
        • 5.6.6: What is HTML Level 3 and where can I learn more about it?
        • 5.6.7: How can I make transparent GIFs?
        • 5.6.8: Which format is better for WWW images, JPEG or GIF?
        • 5.6.9: How can I mirror part of another server?
        • 5.6.10: How come mailto: URLs don't work?
        • 5.6.11: How can I restrict and control access to my server?
        • 5.6.12: How can I keep robots off my server?
• 6: What newsgroups discuss the web?
• 7: I want to know more.
• 8: Credits

1:  Recent  additions  and  changes  to  the  FAQ 

• 9/2/94: Email forms
• 9/2/94: Keeping robots off your server
• 9/2/94: Quadralay commercial-grade Mosaic
• 9/2/94: New location of alternate BBEdit tools
• 9/2/94: Emacs-W3 browser works on the Amiga
• 9/2/94: Enhanced imagemaps section (URLs for other editors wanted!)
• 9/2/94: Big Dummy's Guide is now EFF's Guide
• 9/2/94: Fixed location of Postscript HTML tutorial
• 9/2/94: Added Mac program to transparent section
• 9/2/94: Enhanced section on problems with XMosaic external viewers
• 9/2/94: Removed references to obsolete HTML+ draft
• Closed all <A NAME> tags. Should make browsers happier.
• 9/2/94: Updated location of WinMosaic
• 9/2/94: Updated URL of web space leasing document
• 9/2/94: Email access to the web


(Up  to  Table  of  Contents) 

2:  Information  about  this  document 

This  is  an  introduction  to  the  World  Wide  Web  project,  describing  the  concepts,  software  and  access 
methods.  It  is  aimed  at  people  who  know  a  little  about  navigating  the  Internet,  but  want  to  know  more 
about  WWW  specifically.  If  you  don't  think  you  are  up  to  this  level,  try  an  introductory  Internet  book 
such  as  Ed  Krol's  "The  Whole  Internet"  or  "EFF's  Guide  to  the  Internet".  The  latter  is  available 
electronically  by  anonymous  FTP  from  ftp.eff.org  in  the  directory  pub/Net_info/EFF_Net_Guide. 

This  informational  document  is  posted  to  news.answers,  comp.infosystems.www.users, 
comp.infosystems.www.providers, comp.infosystems.www.misc, comp.infosystems.gopher,
comp.infosystems.wais  and  alt.hypertext  every  four  days  (please  allow  a  day  or  two  for  it  to 
propagate  to  your  site).  The  latest  version  is  always  available  on  the  web  as 
http://sunsite.unc.edu/boutell/faq/www_faq.html (see the section titled "What is a URL?" to
understand  what  this  means.) 

The  most  recently  posted  version  of  this  document  is  kept  on  the  news.answers  archive  on  rtfm.mit.edu 
in /pub/usenet/news.answers/www/faq. For information on FTP, send e-mail to
mail-server@rtfm.mit.edu with:

send usenet/news.answers/finding-sources

in  the  body  (not  subject  line)  of  your  message,  instead  of  asking  me. 

Thomas  Boutell  maintains  this  document.  Feedback  about  it  is  to  be  sent  via  e-mail  to 
boutell@netcom.com. 

In  all  cases,  regard  this  document  as  out  of  date.  Definitive  information  should  be  on  the  web,  and 
static  versions  such  as  this  should  be  considered  unreliable  at  best.  The  most  up-to-date  version  of  the 
FAQ  is  the  version  maintained  on  the  web.  Please  excuse  any  formatting  inconsistencies  in  the  posted 
version  of  this  document,  as  it  is  automatically  generated  from  the  on-line  version. 

(Up to Table of Contents)

3:  Elementary  questions 

3.1:  What  are  WWW,  hypertext  and  hypermedia? 

WWW stands for "World Wide Web". The WWW project, started by CERN (the European Laboratory
for  Particle  Physics),  seeks  to  build  a  distributed  hypermedia  system. 

The  advantage  of  hypertext  is  that  in  a  hypertext  document,  if  you  want  more  information  about  a 
particular  subject  mentioned,  you  can  usually  "just  click  on  it"  to  read  further  detail.  In  fact,  documents 
can  be  and  often  are  linked  to  other  documents  by  completely  different  authors  -  much  like  footnoting, 
but  you  can  get  the  referenced  document  instantly! 

To  access  the  web,  you  run  a  browser  program.  The  browser  reads  documents,  and  can  fetch  documents 
from  other  sources.  Information  providers  set  up  hypermedia  servers  which  browsers  can  get  documents 
from. 

The browsers can, in addition, access files by FTP, NNTP (the Internet news protocol), gopher and an
ever-increasing  range  of  other  methods.  On  top  of  these,  if  the  server  has  search  capabilities,  the 
browsers  will  permit  searches  of  documents  and  databases. 

The  documents  that  the  browsers  display  are  hypertext  documents.  Hypertext  is  text  with  pointers  to 
other text. The browsers let you deal with the pointers in a transparent way - select the pointer, and
you are presented with the text that is pointed to.

Hypermedia is a superset of hypertext - it is any medium with pointers to other media. This means
that browsers might not display a text file, but might display images or sound or animations.

(Up to Table of Contents)

3.2:  What  is  a  URL? 

URL stands for "Uniform Resource Locator". It is a draft standard for specifying an object on the
Internet,  such  as  a  file  or  newsgroup. 

URLs look like this: (file: and ftp: URLs are synonymous.)

• file://wuarchive.wustl.edu/mirrors/msdos/graphics/gifkit.zip
• ftp://wuarchive.wustl.edu/mirrors
• http://info.cern.ch:80/default.html
• news:alt.hypertext
• telnet://dra.com

The first part of the URL, before the colon, specifies the access method. The part of the URL after the
colon is interpreted in a way specific to the access method. In general, two slashes after the colon
introduce a machine name (machine:port is also valid).
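
As a rough illustration of that structure, the sketch below splits a few of the example URLs above into access method, machine, port and remainder. It uses Python's standard urllib.parse module, which is simply one convenient tool for the job; the helper name describe is made up for this example.

```python
from urllib.parse import urlsplit

# Example URLs taken from the list above.
EXAMPLES = [
    "ftp://wuarchive.wustl.edu/mirrors",
    "http://info.cern.ch:80/default.html",
    "news:alt.hypertext",
    "telnet://dra.com",
]


def describe(url):
    """Split a URL into access method, machine, port, and the remainder.

    'describe' is a hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    return {
        "method": parts.scheme,     # the part before the first colon
        "machine": parts.hostname,  # present only when "//" follows the colon
        "port": parts.port,         # None unless machine:port was given
        "rest": parts.path,         # interpreted according to the method
    }


for url in EXAMPLES:
    print(describe(url))
```

Note that news:alt.hypertext has no double slash, so it yields no machine name at all; everything after the colon is left for the news access method to interpret.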

When  you  are  told  to  "check  out  this  URL",  what  to  do  next  depends  on  your  browser;  please  check  the 
help  for  your  particular  browser.  For  the  line-mode  browser  at  CERN,  which  you  will  quite  possibly 
use  first  via  telnet,  the  command  to  try  a  URL  is  "GO  URL"  (substitute  the  actual  URL  of  course).  In 
Lynx  you  just  select  the  "GO"  link  on  the  first  page  you  see;  in  graphical  browsers,  there's  usually  an 
"Open  URL"  option  in  the  menus. 

(Up to Table of Contents)

3.3:  How  does  WWW  compare  to  gopher  and  WAIS? 

While  all  three  of  these  information  presentation  systems  are  client-server  based,  they  differ  in  terms 
of  their  model  of  data.  In  gopher,  data  is  either  a  menu,  a  document,  an  index  or  a  telnet  connection.  In 
WAIS,  everything  is  an  index  and  everything  that  is  returned  from  the  index  is  a  document.  In  WWW, 
everything  is  a  (possibly)  hypertext  document  which  may  be  searchable. 

In  practice,  this  means  that  WWW  can  represent  the  gopher  (a  menu  is  a  list  of  links,  a  gopher 
document  is  a  hypertext  document  without  links,  searches  are  the  same,  telnet  sessions  are  the  same) 
and  WAIS  (a  WAIS  index  is  a  searchable  page,  returning  a  document  with  no  links)  data  models  as 
well  as  providing  extra  functionality. 

Gopher  and  World  Wide  Web  usage  are  now  running  neck  and  neck,  according  to  the  statistics-keepers 
of the Internet backbone. (Of course, World Wide Web browsers can also access Gopher servers, which
inflates the numbers for the latter.) This is changing as WWW reaches critical mass (usage of the
server at CERN doubles every 4 months - twice the rate of Internet expansion).

(Up to Table of Contents)
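
To put the doubling figure in yearly terms, here is a small back-of-the-envelope calculation. It assumes steady exponential growth, and it reads "twice the rate" as meaning the Internet's doubling time is 8 months; that reading is an assumption, not something the statistics above state.

```python
def yearly_growth(doubling_months):
    """Growth factor over 12 months, given a doubling time in months."""
    return 2 ** (12 / doubling_months)


# Doubling every 4 months means three doublings per year.
print(yearly_growth(4))
# Assumed reading of "twice the rate": the Internet doubles every 8 months.
print(round(yearly_growth(8), 2))
```

On these assumptions, usage of the CERN server grows about eightfold in a year, against a little under threefold for the Internet as a whole.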

4.1:  Introduction:  how  can  I  access  the  web? 

You have three options: use a browser on your own machine (the best option), use a browser that can be
telnetted to (not as good), or access the web by email (the least attractive, but for some it's the only
way). It is always best to run a browser on your own machine, unless you absolutely cannot do so; but feel
free to telnet to a browser for your first look at the web, or use email if the telnet command does not
work on your system (try it first!). The following sections cover telnetting to a browser and obtaining
your own browser; if neither of these is possible for you (because you have only an email-and-news
connection to the Internet), here is how to access a web page by email:

Send email to listserv@info.cern.ch containing the following single line. (What you put on the
subject line doesn't matter; blank is OK. This line should go in the text of the message.) You will
receive as a reply a simple page intended to help you learn more about the Web.

send http://www.earn.net/gnrt/www.html
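
For those who script their mail, the request can be composed as in the sketch below. It only builds the message, using Python's standard email package; handing it to a mail system (for example with smtplib) is left out because that depends on your site. The function name build_request and the sender address are placeholders invented for this example.

```python
from email.message import EmailMessage

# Address given in the text above.
LISTSERV = "listserv@info.cern.ch"


def build_request(page_url, sender):
    """Compose a one-line "send URL" request for the email gateway.

    'build_request' is a hypothetical helper; the subject is omitted
    because the gateway ignores it.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = LISTSERV
    msg.set_content("send " + page_url)  # the single line goes in the body
    return msg


# "you@example.org" is a placeholder sender address.
request = build_request("http://www.earn.net/gnrt/www.html", "you@example.org")
print(request)
```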

(Up to Table of Contents)

4.2:  Browsers  accessible  by  telnet 

An up-to-date list of these is available on the Web as
http://info.cern.ch/hypertext/WWW/FAQ/Bootstrap.html and should be regarded as an
authoritative list.

info.cern.ch 

No  password  is  required.  This  is  in  Switzerland,  so  continental  US  users  might  be  better  off  using 
a  closer  browser. 

www.cc.ukans.edu 

A full screen browser "Lynx" which requires a vt100 terminal. Log in as www. Does not allow
users  to  "go"  to  arbitrary  URLs,  so  GET  YOUR  OWN  COPY  of  Lynx  and  install  it  on  your  system 
if  your  administrator  has  not  done  so  already.  The  best  plain-text  browser,  so  move  mountains  if 
necessary  to  get  your  own  copy  of  Lynx! 

www.njit.edu

(or telnet 128.235.163.2) Log in as www. A full-screen browser at the New Jersey Institute of
Technology, USA.

www.huji.ac.il 

A dual-language Hebrew/English database, with links to the rest of the world. The line mode
browser,  plus  extra  features.  Log  in  as  www.  Hebrew  University  of  Jerusalem,  Israel. 

sun.uakom.cs

Slovakia. Has a slow link, only use from nearby.

info.funet.fi

(or telnet 128.214.6.102). Log in as www. Offers several browsers, including Lynx (goto option is
disabled there also).

fserv.kfki.hu 

Hungary.  Has  slow  link,  use  from  nearby.  Login  is  as  www. 

(Up  to  Table  of  Contents) 

4.3:  Obtaining  browsers 

The  preferred  method  of  access  of  the  Web  is  to  run  a  browser  yourself.  Browsers  are  available  for 
many  platforms,  both  in  source  and  executable  forms.  Here  is  a  list  generated  from  the  authoritative 
list,  http://info.cern.ch/hypertext/WWW/Clients.html. 

(Up to Table of Contents)

4.3.1:  Microsoft  Windows  browsers 

NOTE: These browsers require that you have SLIP, PPP or other TCP/IP networking on your PC. SLIP or
PPP can be accomplished over phone lines, but only with the active cooperation of your network
provider or educational institution. If you only have normal dialup shell access, your best option at
this time is to run Lynx on the Unix (or VMS, or...) system you call, or telnet to a browser if you cannot
do so.

Cello

Browser from Cornell LII. Available by anonymous FTP from ftp.law.cornell.edu in the directory
/pub/LII/cello.

Mosaic for Windows

From NCSA. Available by anonymous FTP from ftp.ncsa.uiuc.edu in the directory
PC/Windows/Mosaic.

(Up to Table of Contents)

4.3.2:  MSDOS  browsers 

NOTE:  These  browsers  require  that  you  have  SLIP,  PPP  or  other  TCP/IP  networking  on  your  PC. 
SLIP  or  PPP  can  be  accomplished  over  phone  lines,  but  only  with  the  active  cooperation  of  your 
network  provider  or  educational  institution.  If  you  only  have  normal  dialup  shell  access,  your 
best option at this time is to run Lynx on the Unix (or VMS, or...) system you call, or telnet to a browser
if  you  cannot  do  so. 

DosLynx 

DosLynx  is  an  excellent  text-based  browser  for  use  on  DOS  systems.  You  must  have  a  level  1 
packet  driver,  or  an  emulation  thereof,  or  you  will  only  be  able  to  browse  local  files;  essentially, 
if  your  PC  has  an  Ethernet  connection,  or  you  have  SLIP,  you  should  be  able  to  use  it.  DosLynx  can 
view  GIF  images,  but  not  when  they  are  inline  images  (as  of  this  writing).  See  the 
README.HTM  file  at  the  DosLynx  site  for  details.  You  can  obtain  DosLynx  by  anonymous  FTP 
from  ftp2.cc.ukans.edu  in  the  directory  pub/WWW/DosLynx;  the  URL  is 
ftp://ftp2.cc.ukans.edu/pub/WWW/DosLynx/.

(Up to Table of Contents)

4.3.3: Macintosh browsers

NOTE:  These  browsers  require  that  you  have  SLIP,  PPP  or  other  TCP/IP  networking  on  your  PC.  SLIP  or 
PPP  can  be  accomplished  over  phone  lines,  but  only  with  the  active  cooperation  of  your  network 
provider  or  educational  institution.  If  you  only  have  normal  dialup  shell  access,  your  best  option  at 
this time is to run Lynx on the Unix (or VMS, or...) system you call, or telnet to a browser if you cannot
do  so. 

Mosaic  for  Macintosh 

From NCSA. Full featured. Available by anonymous FTP from ftp.ncsa.uiuc.edu in the directory
Mac/Mosaic. 

Samba 

From CERN. Basic. Available by anonymous FTP from info.cern.ch in the directory
/ftp/pub/www/bin as the file mac.

MacWeb 

From  EINet.  Has  features  that  Mosaic  lacks;  lacks  some  features  that  Mosaic  has.  Available  by 
anonymous  FTP  from  ftp.einet.net  in  the  directory  einet/mac/macweb. 

(Up to Table of Contents)

4.3.4: Amiga browsers

AMosaic

Browser for AmigaOS, based on NCSA's Mosaic. Supports older Amigas as well as the newer
machines in the latest versions, I am told; available for anonymous ftp from
max.physics.sunysb.edu in the directory /pub/amosaic, or from aminet sites in
/pub/aminet/comm/net. For details, see the URL
http://insti.physics.sunysb.edu/AMosaic/home.html.

Emacs-W3 

The  Emacs-W3  browser  works  under  Gnu  Emacs  on  the  Amiga  (see  section  4.3.7). 

(Up  to  Table  of  Contents) 

4.3.5:  NeXTStep  browsers 

Note:  NeXT  systems  can  also  run  X-based  browsers  using  one  of  the  widely  used  X  server  products  for 
the  NeXT.  The  browsers  listed  here,  by  contrast,  are  native  NeXTStep  applications. 

OmniWeb 

A World Wide Web browser for NeXTStep. The URL for more information is
http://www.omnigroup.com/; you can ftp the package from ftp.omnigroup.com in the
/pub/software/ directory.

WorldWideWeb,  CERN's  NeXT  Browser-Editor 

A browser/editor for NeXTStep. Currently out of date; editor not operational. Allows WYSIWYG
hypertext  editing.  Requires  NeXTStep  3.0.  Available  for  anonymous  FTP  from  info.cern.ch  in  the 
directory  /pub/www/src. 

(Up  to  Table  of  Contents) 

4.3.6:  X/DecWindows  (graphical  UNIX,  VMS)  browsers 

NCSA  Mosaic  for  X 

Unix browser using X11/Motif. Multimedia magic. Full http 1.0 support including PUT-method
forms,  image  maps,  etc.  Recommended  if  you  can  run  it.  Available  by  anonymous  FTP  from 
ftp.ncsa.uiuc.edu  in  the  directory  Mosaic. 

NCSA  Mosaic  for  VMS 

Browser using X11/DecWindows/Motif. For the VMS operating system. Multimedia magic. Full
http 1.0 support including PUT-method forms, image maps, etc. Recommended if you can run it.
Available by anonymous FTP from ftp.ncsa.uiuc.edu in the directory Mosaic.

Quadralay GWHIS Viewer (Commercial Mosaic)

Quadralay offers a commercial-grade (not free!) version of Mosaic for Unix systems, with
Windows and Macintosh versions expected in the future. (URL is:
http://www.quadralay.com/products/products.html#gwhis)

tkWWW Browser/Editor for X11

Unix Browser/Editor for X11. (Beta test version.) Available for anonymous ftp from
harbor.ecn.purdue.edu in the directory tkwww[extension] (followed by an extension possibly
dependent on the current version). Please ftp to the site and look for the latest version (or use the
link above). Supports WYSIWYG HTML editing.

Midas  WWW  Browser 

A Unix/X browser from Tony Johnson. (Beta, works well.)

Viola  for  X  (Beta) 

Viola has two versions for Unix/X: one using Motif, one using Xlib (no Motif). Handles HTML
Level 3 forms and tables. Has extensions for multiple columning, collapsible/expandable list,
client-side document include. Available by anonymous FTP from ora.com in /pub/www/viola.
More information available at the URL
http://xcf.berkeley.edu/ht/projects/viola/README.

Chimera 

Unix/X Browser using Athena (doesn't require Motif). Supports forms, inline images, etc.; closest
to Mosaic in feel of the non-Motif X11 browsers. Available for anonymous FTP from
ftp.cs.unlv.edu in the directory /pub/chimera.

(Up  to  Table  of  Contents) 

4.3.7:  Text-mode  Unix  and  VMS  browsers 

These  are  text-based  browsers  for  Unix  (and  in  some  cases  also  VMS)  systems.  In  many  cases  your 
system  administrator  will  have  already  installed  one  or  more  of  these  packages;  check  before 
compiling  your  own  copy. 

Line Mode Browser

This program gives W3 readership to anyone with a dumb terminal. A general purpose
information retrieval tool. Available by anonymous ftp from info.cern.ch in the directory
/pub/www/src.

The "Lynx" full screen browser

This is a hypertext browser for vt100s using full screen, arrow keys, highlighting, etc. Available
by anonymous FTP from ftp2.cc.ukans.edu.

Tom Fine's perlWWW

A tty-based browser written in perl. Available by anonymous FTP from
archive.cis.ohio-state.edu in the directory pub/w3browser as the file w3browser-0.1.shar.

For  VMS 

Dudu  Rashty's  full  screen  client  based  on  VMS's  SMG  screen  management  routines.  Available  by 
anonymous  FTP  from  vms.huji.ac.il  in  the  directory  www/www_client. 

Emacs  w3-mode 

W3  browse  mode  for  emacs.  Uses  multiple  fonts  when  used  with  Lemacs  or  Epoch.  See  the 
documentation.  Available  by  anonymous  FTP  from  moose.cs.indiana.edu  in  the  directory 
pub/elisp/w3  as  the  files  w3.tar.Z  and  extras.tar.Z. 

(Up  to  Table  of  Contents) 

4.3.8:  Batch-Mode  "Browsers" 

Batch  mode  browser 

A batch-mode "browser", url_get, which is available through the URL
http://wwwhost.cc.utexas.edu/test/zippy/url_get.html. It can be retrieved via anonymous FTP
to ftp.cc.utexas.edu, as the file /pub/zippy/url_get.tar.Z. This package is intended for use in cron
jobs and other settings in which fetching a page in a command-line fashion is useful.
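
To give a feel for what batch-mode fetching means, here is a minimal sketch in Python. It is not url_get and does not reproduce its options; it only shows the general idea of taking a URL on the command line and writing the document to standard output, using the standard urllib.request module. The names fetch and main, and the script name fetch_page.py, are invented for this example.

```python
import sys
import urllib.request


def fetch(url, timeout=30):
    """Return the raw bytes of the document at the given URL."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def main(argv):
    """Write the document named by argv[1] to standard output."""
    if len(argv) != 2:
        print("usage: fetch_page.py URL", file=sys.stderr)
        return 2
    sys.stdout.buffer.write(fetch(argv[1]))
    return 0


if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

Run from a cron job, something like "python fetch_page.py URL > page.html" would save a copy of the page on each run, with no interaction required.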

(Up  to  Table  of  Contents) 

4.4:  How  can  I  access  the  web  through  a  firewall? 

For  information  on  using  NCSA  Mosaic  from  behind  a  firewall,  please  read  the  following.  In  general, 
browsers  can  be  made  useful  behind  firewalls  through  the  use  of  a  package  called  "SOCKS";  the 
source must be modified slightly and rebuilt to accommodate this. Whenever possible, work with your
network administrators to solve the problem, not against them.

An  excerpt  from  the  NCSA  Mosaic  FAQ: 

NCSA  Mosaic  requires  a  direct  internet  connection  to  work,  but  some  folks  have  put  together  a  package 
that  works  behind  firewalls.  This  is  completely  unsupported  by  NCSA,  but  here  is  the  latest 
announcement:

November  15, 1993:  C&C  Software  Technology  Center  (CSTC)  of  NEC  Systems  Lab  has  made 
available  a  version  of  SOCKS,  a  package  for  running  Internet  clients  from  behind  firewalls 
without  breaching  security  requirements,  that  includes  a  suitably  modified  version  of  Mosaic  for 
X  2.0.  Beware:  such  a  version  is  not  supported  by  NCSA;  we  can't  help  with  questions  or  problems 
arising  from  the  modifications  made  by  others.  But,  we  encourage  you  to  check  it  out  if  it's 
interesting to you. Questions and problem notifications can be sent to Ying-Da Lee
(ylee@syl.dl.nec.com).

(Up  to  Table  of  Contents) 

4.5:  What  is  on  the  web? 

Currently  accessible  through  the  web: 

• anything served through gopher
• anything served through WAIS
• anything on an FTP site
• anything on Usenet
• anything accessible through telnet
• anything in hytelnet
• anything in hyper-g
• anything in techinfo
• anything in texinfo
• anything in the form of man pages
• sundry hypertext documents

(Up  to  Table  of  Contents) 

4.5.1:  How  do  I  find  out  what's  new  on  the  web? 

The  unofficial  newspaper  of  the  World  Wide  Web  is  What's  New  With  NCSA  Mosaic  (URL  is 
http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/Docs/whats-new.html),  which  carries 
announcements  of  new  servers  on  the  web  and  also  of  new  web-related  tools.  This  should  be  in  your  hot 
list  if  you're  not  using  Mosaic  (which  can  access  it  directly  through  the  help  menu). 

(Up  to  Table  of  Contents) 

4.5.2:  Where  is  the  subject  catalog  of  the  web? 

There  are  several.  There  is  no  mechanism  inherent  in  the  web  which  forces  the  creation  of  a  single 
catalog  (although  there  is  work  underway  on  automatic  mechanisms  to  catalog  web  sites).  The 
best-known catalog, and the first, is The WWW Virtual Library (URL is
http://info.cern.ch/hypertext/DataSources/bySubject/Overview.html), maintained by CERN. The
Virtual  Library  is  a  good  place  to  find  resources  on  a  particular  subject,  and  has  separate  maintainers 
for  many  subject  areas. 

There is also a newer cataloging system called ALIWEB that requires very little effort to maintain
and is growing rapidly (URL is http://web.nexor.co.uk/aliweb/doc/aliweb.html).

(Up to Table of Contents)

4.5.3:  How  can  I  search  through  ALL  web  sites? 

Several  people  have  written  robots  which  create  indexes  of  web  sites  --  including  sites  which  have  not 
arranged  to  be  mentioned  in  the  newspapers  and  catalogs  above.  (Before  writing  your  own  robot,  please 
read  the  section  on  robots.) 

Here  are  a  few  such  automatic  indexes  you  can  search: 

• WebCrawler (URL is http://www.biotech.washington.edu/WebQuery.html) builds an
  impressively complete index; on the other hand, since it indexes the content of documents, it
  may find many links that aren't exactly what you had in mind.

• World Wide Web Worm (URL is
  http://www.cs.colorado.edu/home/mcbryan/WWWW.html) builds its index based on page
  titles and URL contents only. This is somewhat less inclusive, but pages it finds are more
  likely to be an exact match with your needs.

You can read about other robots in the robots section.

(Up to Table of Contents)

4.6:  How  can  I  save  an  inline  image  to  disk? 

Here  are  two  ways: 

1.  Turn  on  "load  to  local  disk"  in  your  browser,  if  it  has  such  an  option;  then  reload  images.  You'll  be 
prompted  for  filenames  instead  of  seeing  them  on  the  screen.  Be  sure  to  shut  it  off  when  you're  done 
with  it. 

2.  Choose  "view  source"  and  browse  through  the  HTML  source;  find  the  URL  for  the  inline  image  of 
interest  to  you;  copy  and  paste  it  into  the  "Open  URL"  window.  This  should  load  it  into  your  image 
viewer  instead,  where  you  can  save  it  and  otherwise  muck  about  with  it. 
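
The "find the URL for the inline image" step in the second method can also be done mechanically. The following Python sketch lists the SRC of every IMG tag in a page and resolves relative references against the page's own URL. The class and function names are hypothetical, and the page text is assumed to have been saved already (for example with "view source").

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageSourceFinder(HTMLParser):
    """Collect the SRC attribute of every IMG tag in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands us tag and attribute names already lowercased.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Relative SRC values are resolved against the page's URL.
                self.image_urls.append(urljoin(self.base_url, src))


def find_inline_images(html_text, base_url):
    """Return the absolute URLs of all inline images, in document order."""
    finder = ImageSourceFinder(base_url)
    finder.feed(html_text)
    return finder.image_urls
```

Each URL returned can then be pasted into the "Open URL" window as described above.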

(Up  to  Table  of  Contents) 

4.7:  How  can  I  get  sound  from  the  PC  speaker  with  WinMosaic? 

This  piece  of  wisdom  donated  by  Hunter  Monroe: 

This  section  explains  how  to  install  sound  on  a  PC  which  already  has  a  working  version  of  Mosaic  for 
Microsoft  Windows.  Be  warned  in  advance  that  the  results  may  be  poor. 

To  get  Mosaic  to  produce  sound  out  of  the  PC  speaker,  first,  you  need  a  driver  for  the  speaker.  You  can 
get the Microsoft speaker driver from the URL
ftp://ftp.microsoft.com/Softlib/MSLFILES/SPEAK.EXE or by doing an Archie search to find it
somewhere  else.  SPEAK.EXE  is  a  self-extracting  file.  Copy  the  speak.exe  file  to  a  new  directory,  and 
then  type  "SPEAK"  at  the  DOS  prompt.  Do  not  put  the  file  SPEAKER.DRV  in  a  separate  directory 
from  OEMSETUP.INF. 

Now,  you  need  to  install  the  driver.  In  Windows,  from  the  Program  Manager  choose  successively 
Main/Control Panel/Drivers/Add/Unlisted or updated drivers/(enter path of SPEAK.EXE)/PC
Speaker. At this point some strange sounds come out as the driver is initialized. Change the settings to
improve the sound quality on the various sounds: tada, chimes, etc. Click OK when you are finished
and choose the Restart Windows option.

Having installed the speaker driver, you will now get sounds whenever you start Windows, make a
mistake, or exit Windows. If you do not want this, from the Main/Control Panel/Sounds menu, make
sure there is no X next to "Enable System Sounds."

Now, you need a sound viewer program that Mosaic can call to play sounds. NCSA unfortunately
recommends WHAM, which does not work well with a PC speaker. Get the program WPLANY
instead. You can find a copy nearby with an Archie search on the string "wplny"; the current version is
WPLNY09B.ZIP. For details on Archie and other basic issues related to FTP, please read the Usenet
newsgroup news.announce.newusers.

Move the zip file to a new directory, and use an unzip program like pkunzip to unzip it, producing the
files WPLANY.EXE and WPLANY.DOC. Then edit the MOSAIC.INI file to remove the "REM"
before the line "TYPE9=audio/basic". Then, you need lines in the section below that read something
like:

audio/basic="c:\wplany\wplany.exe %ls"
audio/wav="c:\wplany\wplany.exe %ls"

where you have filled in the correct path for wplany.exe. The MOSAIC.INI file delivered with Mosaic may
have NOTEPAD.EXE on the audio/basic line, but this will not work. Now, restart Mosaic, and you
should be able to produce sounds. To check this, with Mosaic choose File/Local
File/\WINDOWS\*.WAV and then try to play TADA.WAV. Then, you might try the Mosaic Demo
document for some .AU sounds, but you are lucky if your speaker produces something you can understand.

(Up to Table of Contents)

4.8: I have a Windows PC or Macintosh. Why can't I access WAIS URLs?

This answer provided by Michael Grady (m-grady@uiuc.edu):

The  version  of  Mosaic  for  X  has  "wais  client"  code  built-in  to  it.  This  was  relatively  easy  for  the 
developers  to  do,  because  there  was  already  a  set  of  library  routines  for  talking  to  WAIS  available  for 
Unix  as  "public  domain"  (freeWAIS).  I  don't  think  there  is  such  a  library  of  routines  for  PC/Windows 
or Mac, which would make it much more difficult for the Mosaic versions for Windows and the Mac to
add "wais client" capability. Therefore, at least for now, neither the Windows nor the Mac versions of
Mosaic support direct query of a WAIS server (i.e. can act as wais clients themselves).

(Up  to  Table  of  Contents) 

4.9:  I'm  running  XMosaic.  Why  can't  I  get  external  viewers  working... 

... No matter what I do to my .mailcap and .mime.types files?

Answer  provided  by  Ronald  E.  Daniel  (rdaniel@acl.lanl.gov): 

Mosaic only looks at the .mime.types file if it has no idea what the document's type is. This is
actually  a  very  rare  situation.  Essentially  all  servers  now  use  the  HTTP/1.0  protocol,  which 
means  that  they  tell  Mosaic  (or  other  browsers)  what  the  document's  MIME  Content-type  is.  The 
servers  use  a  file  very  much  like  Mosaic's  .mime.types  file  to  infer  the  Content-type  from  the 
filename's  extension. 

It  is  pretty  simple  to  find  out  if  this  really  is  the  problem.  Use  telnet  to  talk  to  the  server  and 
find  out  if  it  is  assigning  a  MIME  type  to  the  document  in  question.  Here's  an  example,  looking  at 
the home page for my server. (idaknow: is my shell prompt.)

idaknow: telnet www.acl.lanl.gov 80      // Connect to the httpd server
Trying 128.165.148.3 ...
Connected to www.acl.lanl.gov.
Escape character is '^]'.
HEAD /Home.html HTTP/1.0                 // replace Home.html with your document
                                         // you supply the blank line
HTTP/1.0 200 OK                          // the rest of this comes from the server
Date: Wednesday, 25-May-94
Server: NCSA/1.1
MIME-version: 1.0
Content-type: text/html                  // Here's the MIME Content-type
Last-modified: Monday, 16-May-94 16:21:58 GMT
Content-length: 1727

Connection closed by foreign host.
idaknow:

In the example above, /Home.html will get http://www.acl.lanl.gov/Home.html.
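
The telnet session above can also be scripted. The Python sketch below sends the same HEAD request over a plain socket and picks the Content-type out of the reply. The function names are hypothetical, and it assumes the server answers in the HTTP/1.0 style shown in the example.

```python
import socket


def build_head_request(path):
    """Return the bytes of an HTTP/1.0 HEAD request, including the blank line."""
    return ("HEAD %s HTTP/1.0\r\n\r\n" % path).encode("ascii")


def parse_head_response(raw):
    """Split a raw reply into (status_line, headers); header names are lowercased."""
    lines = raw.decode("iso-8859-1").replace("\r\n", "\n").split("\n")
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the header block
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers


def head(host, path, port=80, timeout=30):
    """Send the HEAD request to host:port and return the parsed reply."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_head_request(path))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break  # the server closed the connection
            chunks.append(data)
    return parse_head_response(b"".join(chunks))
```

Calling head("www.example.com", "/Home.html") (a placeholder host) and looking at headers.get("content-type") answers the same question as the manual session.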

Normally  servers  will  be  configured  to  supply  a  Content-type  of  text/plain  if  they  don't  know 
what else to do. If this is the problem you are having, take a look at the TypesConfig
documentation  for  NCSA's  httpd.  You  can  have  the  server  look  at  the  filename  extension,  supply 
the  correct  Content-type,  then  use  your  local  .mailcap  file  to  tell  Mosaic  what  viewer  to  use  to 
look  at  the  document. 
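
As an illustration of what the server does with the filename extension, here is a small Python sketch. The table below is a hypothetical stand-in for a server's real type configuration, not a copy of any actual file; the text/plain fallback mirrors the behavior described above.

```python
import os

# Hypothetical extension table, in the spirit of a server's mime.types file.
EXTENSION_TYPES = {
    ".html": "text/html",
    ".txt": "text/plain",
    ".gif": "image/gif",
    ".jpeg": "image/jpeg",
    ".au": "audio/basic",
    ".mpeg": "video/mpeg",
}

# What the server falls back to when the extension tells it nothing.
DEFAULT_TYPE = "text/plain"


def content_type_for(filename):
    """Infer a MIME Content-type from the filename's extension."""
    extension = os.path.splitext(filename)[1].lower()
    return EXTENSION_TYPES.get(extension, DEFAULT_TYPE)
```

A document whose extension is missing from the table is served as text/plain, which is exactly the case in which the browser never gets a chance to pick the right viewer.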

Russ  Segal  adds: 

The  answer  from  Ronald  Daniel  is  essentially  correct,  but  it  needs  a  small  addendum. 
When starting Mosaic, you can specify a "fileProxy" which will fetch files for you:
"*fileProxy: http://socks/"

If you do this, file: URLs are no longer strictly local accesses. So even if the URL is not http:, the
proxy server must be upgraded as Mr. Daniel suggests.

(Up to Table of Contents)

4.10:  Hey,  I  know,  I'll  write  a  WWW-exploring  robot!  Why  not? 

Programs  that  automatically  traverse  the  web  can  be  quite  useful,  but  have  the  potential  to  make  a 
serious  mess  of  things.  Robots  have  been  written  which  do  a  "breadth-first"  search  of  the  web, 
exploring  many  sites  in  a  gradual  fashion  instead  of  aggressively  "rooting  out"  the  pages  of  one  site  at 
a  time.  Some  of  these  robots  now  produce  excellent  indexes  of  information  available  on  the  web. 

But  others  have  written  simple  depth-first  searches  which,  at  the  worst,  can  bring  servers  to  their 
knees  in  minutes  by  recursively  downloading  information  from  CGI  script-based  pages  that  contain  an 
infinite  number  of  possible  links.  (Often  robots  can't  realize  this!)  Imagine  what  happens  when  a  robot 
decides  to  "index"  the  CONTENTS  of  several  hundred  mpeg  movies.  Shudder. 
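
One way to picture the gentler, breadth-first approach is the Python sketch below. It is only an illustration: get_links is a hypothetical callback that fetches one page and returns the URLs it links to, and the page and depth limits are arbitrary values chosen to keep a robot from wandering forever through script-generated links.

```python
from collections import deque


def breadth_first_crawl(start_url, get_links, max_pages=50, max_depth=3):
    """Discover pages level by level, never twice, within page and depth limits."""
    discovered = [start_url]
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # pages this deep are recorded but their links are not followed
        for link in get_links(url):
            if link in seen:
                continue  # already queued or visited
            if len(discovered) >= max_pages:
                return discovered
            seen.add(link)
            discovered.append(link)
            queue.append((link, depth + 1))
    return discovered
```

A real robot would also pause between requests to the same server and skip documents, such as movies, that it has no business indexing.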

The moral: a robot that does what you want may already exist; if it doesn't, please study the document
World Wide Web Robots, Wanderers and Spiders (URL is
http://web.nexor.co.uk/mak/doc/robots/robots.html) and learn about the emerging standards for
exclusion of robots from areas in which they are not wanted. You can also read about existing robots
there.
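
As a rough illustration of honoring such exclusions, the Python sketch below uses the standard library's urllib.robotparser. It assumes the exclusion rules take the form of a robots.txt file with User-agent and Disallow lines; the robot name and the rules themselves are made up for the example.

```python
from urllib.robotparser import RobotFileParser

ROBOT_NAME = "examplebot"  # hypothetical robot name

# Hypothetical exclusion file a server might publish for robots.
EXAMPLE_RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /movies/
"""


def load_rules(text):
    """Parse the text of an exclusion file into a rule set."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def may_fetch(parser, url):
    """True when the exclusion rules allow ROBOT_NAME to retrieve `url`."""
    return parser.can_fetch(ROBOT_NAME, url)
```

A robot would run this check before every request and simply skip any URL for which it returns False.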

(Up to Table of Contents)

4.11:  How  do  I  send  newsgroup  posts  in  HTML  to  my  web  client? 

How to do this depends greatly on your system; if you have a Mac or Windows system, the answer is
completely different. But, as food for thought, here is a simple shell script I use on my Unix account to
send posts from rn and related newsreaders to Lynx. Put this text in the file "readwebpost" and use the
"chmod" command to make it executable, then put it somewhere in your path (such as your personal bin
directory):

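#!/bin/sh
# Wrap the article arriving on standard input in <PRE> tags.
echo \<PRE\> > .article.html
cat >> .article.html
echo \</PRE\> >> .article.html
# Lynx reads its keystrokes from the terminal, since stdin was the piped article.
lynx .article.html < /dev/tty
rm .article.html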

Then  add  the  following  line  to  your  .rnmac  file  (create  it  if  you  don't  already  have  one): 

W  |readwebpost   %C 

Now,  when  you  press  "W"  while  reading  a  post  in  rn,  a  message  will  be  sent  to  Lynx,  and  the  links 
enclosed  in  it  will  be  live. 

Larry  W.  Virden  provides  the  following  version  which  invokes  Mosaic  instead,  and  is  also  capable  of 
communicating  with  an  already-running  copy  of  Mosaic  instead  of  launching  another.  (You  can  use  the 
same  rn  macro  as  above,  invoking  "goto-xm"  instead  of  "readwebpost".)  Read  the  comments  for  details 
on  the  assumptions  made  by  the  script. 

#!/bin/sh
# goto-xm, by Joseph T. Buck
# Modified heavily by Larry W. Virden
# Script for use with newsreaders such as trn.  Piping the article
# through this command causes xmosaic to pop up, pointing to the
# article.  If an existing xmosaic (version 1.1 or later) exists,
# the USR1 method will be used to cause it to point to the correct
# article, otherwise a new one will be started.
# assumptions: ps command works as is on SunOS 4.1.x, may need changes
# on other platforms.

# Turn the article's Message-ID header into a news: URL.
URL=`/bin/grep '^Message-ID:' | /bin/sed -e 's/.*</news:/' -e 's/>.*//'`
if [ "X$URL" = "X" ]; then
    echo "USAGE: $0 [goto] [once] < USENET_msg" >&2
    exit 1
fi

# Process ID of the first running Mosaic, if there is one.
pid=`ps -xc | egrep '[Mm]osaic' | awk 'NR == 1 {print $1}'`
p=`which Mosaic`

# No Mosaic running: start a new one on the article and stop here.
if [ "X$pid" = "X" ]; then
    $p "$URL" &
    exit 0
fi

# Otherwise write the control file (a directive line, then the URL)
# and signal the running Mosaic to read it.
gfile=/tmp/Mosaic.$pid
if [ "$#" -gt 0 ] ; then
    if [ "$1" = "goto" -o "$1" = "same" ] ; then
        shift
        echo "goto" > $gfile
    else
        echo "newwin" > $gfile
    fi
else
    echo "newwin" > $gfile
fi
echo "$URL" >> $gfile

trap "echo signal encountered" 30
kill -USR1 $pid

exit 0

(Up to Table of Contents)


5.1:  Introduction:  How  can  I  provide  information  to  the  web? 

Information  providers  run  programs  that  the  browsers  can  obtain  hypertext  from.  These  programs  can 
either  be  WWW  servers  that  understand  the  HyperText  Transfer  Protocol  HTTP  (best  if  you  are 
creating  your  information  database  from  scratch),  "gateway"  programs  that  convert  an  existing 
information  format  to  hypertext,  or  a  non-HTTP  server  that  WWW  browsers  can  access  --  anonymous 
FTP  or  gopher,  for  example. 

To learn more about World Wide Web servers, you can consult a WWW server primer by Nathan
Torkington,  available  at  the  URL 
http://www.vuw.ac.nz/who/Nathan.Torkington/ideas/www-servers.html. 

If  you  only  want  to  provide  information  to  local  users,  placing  your  information  in  local  files  is  also  an 
option.  This  means,  however,  that  there  can  be  no  off-machine  access. 

(Up to Table of Contents)

5.2:  Obtaining  Servers 

Servers  are  available  for  Unix,  Macintosh,  MS  Windows,  and  VMS  systems.  If  you  know  of  a  server  for 
another  operating  system,  please  contact  me. 

See http://info.cern.ch/hypertext/WWW/Daemon/Overview.html for more information on writing
servers  and  gateways  in  general. 

(Up  to  Table  of  Contents) 

5.2.1:  Unix  Servers 

NCSA  httpd 

NCSA  has  released  a  server,  known  as  the  NCSA  httpd;  it  is  available  at  the  URL 
ftp://ftp.ncsa.uiuc.edu/Web/ncsa_httpd.

CERN  httpd 

CERN's server is available for anonymous FTP from info.cern.ch (URL is
http://info.cern.ch/hypertext/WWW/Daemon/Status.html) and many other places.
Use your local copy of archie to search for "www" in order to find a nearby site.

GN  Gopher/HTTP  server 

The  GN  server  is  unique  in  that  it  can  serve  both  WWW  and  Gopher  clients  (in  their  native 
modes).  This  is  a  good  server  for  those  migrating  from  Gopher  to  WWW,  although  it  does  not 
have  the  server-side-script  capabilities  of  the  NCSA  and  CERN  servers.  See  the  URL 
http://hopf.math.nwu.edu/. 

Perl  server 

There  is  also  a  server  written  in  the  Perl  scripting  language,  called  Plexus,  for  which 
documentation  is  available  at  the  URL  http://bsdi.com/server/doc/plexus.html. 

(Up to Table of Contents)

5.2.2: Macintosh Servers

There is a server for the Macintosh, MacHTTP, available at the URL
http://www.uth.tmc.edu/mac_info/machttp_info.html.

(Up to Table of Contents)

5.2.3:  MS  Windows  and  Windows  NT  Servers 

HTTPS  (Windows  NT) 

HTTPS is a server for Windows NT systems, both Intel- and Alpha-based. It is available via
anonymous FTP from emwac.ed.ac.uk in the directory pub/https (URL is
ftp://emwac.ed.ac.uk/pub/https). (Be sure to download the version appropriate to your
processor.) You can read a detailed announcement at the FTP site, or by using the URL
ftp://emwac.ed.ac.uk/pub/https/https.txt.

NCSA  httpd  for  Windows 

The  NCSA  httpd  for  Windows  has  most  of  the  features  of  the  Unix  version,  including  scripts 
(which  generate  pages  on  the  fly  based  on  user  input).  It  is  available  by  anonymous  FTP  from 
ftp.ncsa.uiuc.edu in the Web/ncsa_httpd/contrib directory as the file whtp11a6.zip, or at the
URL ftp://ftp.ncsa.uiuc.edu/Web/ncsa_httpd/contrib/whtp11a6.zip.

SerWeb 

A simple, effective server for Windows written by Gustavo Estrella. Available by anonymous
ftp  from  winftp.cica.indiana.edu  (or  one  of  its  mirror  sites,  such  as  nic.switch.ch),  as  the  file 
serweb03.zip,  in  the  directory  /pub/pc/win3/winsock. 

There is also a Windows NT version of SerWeb, available by anonymous FTP from emwac.ed.ac.uk as
/pub/serweb/serweb_i.zip.

WEB4HAM

Another Windows-based server, available by anonymous FTP from
ftp.informatik.uni-hamburg.de as /pub/net/winsock/web4ham.zip.

(Up to Table of Contents)

5.2.4:  MSDOS  Servers 

KA9Q NOS (nos11c.exe) is an Internet server package for DOS that includes HTTP and Gopher servers.
It can be obtained via anonymous FTP from one of the following sites:

inorganic5.chem.ufl.edu
biochemistry.cwru.edu

(Up to Table of Contents)

5.2.5: VMS Servers

CERN  HTTP  for  VMS 

A  port  of  the  CERN  server  to  VMS.  Available  at  the  URL 
http://delonline.cern.ch/disk$user/duns/doc/vms/distribution.html. 

Region  6  Threaded  HTTP  Server 

A  native  VMS  server  which  uses  DECthreads(tm).  This  is  a  potentially  major  performance 
advantage  because  VMS  has  a  high  overhead  for  each  process,  which  is  a  problem  for  the 
frequently-forking  NCSA  and  CERN  servers  that  began  life  under  Unix.  A  multithreaded  server 
avoids  this  overhead.  Available  at  the  URL 
http://kcgl1.eng.ohio-state.edu/www/doc/serverinfo.html.

(Up to Table of Contents)

5.2.6:  Amiga  Servers 

NCSA's  Unix  server  has  been  ported  to  the  Amiga,  and  is  bundled  with  the  AMosaic  browser.  See  the 
URL  http://insti.physics.sunysb.edu/AMosaic/home.html  for  details. 

(Up to Table of Contents)

5.3:  Producing  HTML  documents 

HTML is the simple markup system used to create hypertext documents. There are three ways to
produce HTML documents: writing them yourself, which is not a very difficult skill to acquire; using
an HTML editor, which assists in doing the above; and converting documents in other formats to
HTML. The following three sections cover these possibilities in sequence.

(Up to Table of Contents)

5.3.1:  Writing  HTML  documents  yourself 

You  can  write  an  HTML  document  with  any  text  editor.  Try  the  "source"  button  of  your  browser  (or 
"save  as"  HTML)  to  look  at  the  HTML  for  a  page  you  find  particularly  interesting.  The  odds  are  that 
it  will  be  a  great  deal  simpler  than  you  would  expect.  If  you're  used  to  marking  up  text  in  any  way 
(even  red-pencilling  it),  HTML  should  be  rather  intuitive. 

A beginner's guide to HTML is available at the URL
http://www.ncsa.uiuc.edu/General/Internet/WWW/HTMLPrimer.html. You can also find a plain
text version (at the URL ftp://ftp.ncsa.uiuc.edu/ncsapubs/WWW/HTMLPrimer.txt) and a
compressed Postscript version (at the URL
ftp://ftp.ncsa.uiuc.edu/ncsapubs/WWW/HTMLPrimer.ps.Z). (Since the latter two are FTP URLs, you
can fetch them by hand using FTP if you do not yet have a web browser.)

There  is  also  a  good  set  of  HTML  documentation  available  at  the  URL 
http://www.ucc.ie/info/net/htmldoc.html. 

There is also an HTML primer by Nathan Torkington at the URL
http://www.vuw.ac.nz/who/Nathan.Torkington/ideas/www-html.html.

(Up to Table of Contents)

5.3.2:  HTML  editors 

Of  course,  most  folks  would  still  prefer  to  use  a  friendlier,  graphical  editor.  Some  editors  are 
WYSIWYG  (What  You  See  Is  What  You  Get),  or  close  to  it;  others  simply  assist  you  in  writing  HTML 
by  plugging  in  the  desired  markup  tags  for  you  from  a  menu. 

Fans of the EMACS editor can use EMACS and html-helper-mode, an EMACS "mode" for HTML
editing (URL is http://www.reed.edu/~nelson/tools/).

There is also another Emacs HTML mode, html-mode.el (URL is
ftp://ftp.ncsa.uiuc.edu/Web/elisp/html-mode.el).

For Microsoft Windows users, there is an editor called HTML Assistant with features to assist in the
creation of HTML documents. It can be had by anonymous FTP from ftp.cs.dal.ca in the directory
/htmlasst/. Read the README.1ST file in this directory for information on which files to
download.

A WYSIWYG editor for the Web, *SoftQuad HoTMetaL*, is available for downloading at NCSA and
other Mosaic server sites. Many mirror sites exist; if you can't get through to one, try another; don't
give up! That's what mirror sites are for. (Also be sure to use the copy closest to you geographically if
possible.)

Known  mirrors: 

• ftp://ftp.ncsa.uiuc.edu/Mosaic/contrib/SoftQuad/sqhotmetal-1.0.tar.gz
• ftp://ftp.ifi.uio.no/pub/SGML/HoTMetaL
• ftp://sgml1.ex.ac.uk/SoftQuad
• ftp://doc.ic.ac.uk/pub/packages/WWW/ncsa/contrib/SoftQuad
• ftp://askhp.ask.uni-karlsruhe.de/pub/infosystems/mosaic/contrib/SoftQuad
• ftp://ftp.cs.concordia.ca/pub/www

You  need  a  Sun  SPARC  or  Microsoft  Windows  system  and  6MB  of  disk  (6MB  of  RAM  minimum  for  MS 
Windows).  Because  it  is  context-sensitive,  HoTMetaL  guides  users  in  creating  new  HTML  documents 
and  in  cleaning  up  old  ones.  A  Publish  command  changes  appropriate  SRC  and  HREF  attributes  from 
local  paths  to  http  locations.  For  more  information,  FTP  the  README  file  from  the  same  directory,  or 
send  email  to  hotmetal@sq.com.  A  HoTMetaL  Pro  commercially  supported  version  is  available  for 
purchase  from  SoftQuad  and  its  resellers. 

An  editor  for  all  X  users:  TkWWW  (listed  above  under  X  browsers)  supports  WYSIWYG  HTML 
editing;  and  since  it's  a  browser,  you  can  try  out  links  immediately  after  creating  them. 

Also  for  X  users,  there  is  a  package  called  htmltext  which  supports  WYSIWYG  HTML  editing.  More 
information is available at the URL http://web.cs.city.ac.uk/homes/njw/htmltext/htmltext.html.

For  Macintosh  users,  there  is  evidently  a  near-WYSIWYG  package  called  HTML  Editor  (URL  is 
http://dragon.acadiau.ca:1667/~giles/HTML_Editor).

Also for Macintosh users, the BBEdit HTML extensions allow the BBEdit and BBEdit Lite text editors
for the Macintosh to conveniently edit HTML documents. (URL is
http://www.uji.es/bbedit-html-extensions.html.) You can also obtain the extensions package by
anonymous FTP from sumex-aim.stanford.edu as info-mac/bbedit-html-ext-b3.hqx.

There is an alternative BBEdit extension package available as well (URL is
http://www.york.ac.uk/~ld11/BBEditTools.html). It is available by FTP from ftp.york.ac.uk
as the file /pub/users/ld11/BBEdit_HTML_Tools.sea.hqx.

NCSA's List of Filters and Editors, for which the URL is
http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/Docs/faq-software.html#editors, mentions
several editors, including two for MS Windows. In some cases, the "editor" amounts to a set of macros
for an existing word processor, which can provide a near-WYSIWYG environment.

Note  that  this  URL  contains  uppercase  and  lowercase  letters;  certain  operating  systems  won't  allow 
mixed  case  on  the  command  line,  or  will  only  allow  it  if  it  is  quoted  (VMS),  so  if  you  are  launching 
Lynx  or  another  client  and  specifying  a  URL  at  the  command  line,  try  quoting  the  URL  in  double-quotes 
("URL"). 

Another option, if you have an SGML editor, is to use it with the HTML DTD.

(Up  to  Table  of  Contents) 

5.3.3:  Converting  other  formats  to  HTML 

There  is  a  collection  of  filters  for  converting  your  existing  documents  (in  TeX  and  other  non-HTML 
formats)  into  HTML  automatically,  including  filters  that  can  allow  more  or  less  WYSIWYG  editing 
using  various  word  processors: 


• Rich Brandwein and Mike Sendall's List at CERN. The URL is
  http://info.cern.ch/hypertext/WWW/Tools/Filters.html.

(Note that this URL contains uppercase and lowercase letters; certain operating systems such as VMS
require you to quote mixed-case URLs when launching a browser from the command line. This is NOT a
bug in the browser.)

There is also a Word for Windows template for writing HTML documents, available at the URL
http://www.gatech.edu/word_html/release.htm.

(Up to Table of Contents)

5.4:  How  do  I  publicize  my  work? 

There  are  several  things  you  can  do  to  publicize  your  new  HTML  server  or  other  offering: 

• Submit it to the NCSA What's New Page at the URL
  http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/Docs/whats-new.html (see the page for
  details on how to submit your listing!).

• Post it to the newsgroup comp.infosystems.announce. Please read the group first to get a feel for
  the contents. You should not post to comp.infosystems.www.users, .misc, .providers, etc., but if
  you feel compelled to do so, please choose .misc as announcements are of interest to both
  providers and users (and those who wear both hats).

• Submit it to the maintainers of various catalogs, such as the WWW Virtual Library (at the
  URL http://info.cern.ch/hypertext/DataSources/bySubject/Overview.html) and the
  ALIWEB index (at the URL http://web.nexor.co.uk/aliweb/doc/aliweb.html).

(Up  to  Table  of  Contents) 

5.5:  Can  I  buy  space  on  an  existing  server? 

Yes, you can. A list of sites offering WWW space for lease is available (at the URL
http://union.ncsa.uiuc.edu/www/leasing.html).

(Up  to  Table  of  Contents) 

5.6.1:  How  do  I  set  up  a  clickable  image  map? 

There  are  really  two  issues  here:  how  to  indicate  in  HTML  that  you  want  an  image  to  be  clickable,  and 
how  to  configure  your  server  to  do  something  with  the  clicks  returned  by  Mosaic,  Chimera,  and  other 
clients  capable  of  delivering  them. 

You  can  read  about  image  maps  and  the  NCSA  server  at  the  URL 
http://hoohoo.ncsa.uiuc.edu/docs/setup/admin/Imagemap.html.

Using  imagemaps  requires  that  you  create  a  map  file;  you  can  do  this  by  hand  or  with  a  WYSIWYG 
tool. I wrote Mapedit (URL is: http://sunsite.unc.edu/boutell/mapedit/mapedit.html),
which  is  such  a  tool  for  Microsoft  Windows  and  the  X  Window  System.  Other  tools  are  available. 
(URLs,  anyone?) 

Important  Note:  Creating  imagemaps  requires  a  cooperative  server  administrator  and  a  real  web 
server.  Don't  waste  time  making  maps  before  making  sure  you  have  the  necessary  tools  to  deliver 
them.
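
To make the idea of a map file concrete, here is a Python sketch of the lookup a server-side imagemap program performs: given the coordinates of a click, find the region that contains it and return that region's URL. The map layout used here ("rect URL x1,y1 x2,y2" and "default URL") is an assumption for illustration only; check your server's documentation for the format it actually expects.

```python
def parse_map(text):
    """Parse lines such as 'rect URL x1,y1 x2,y2' and 'default URL' (assumed layout)."""
    regions, default_url = [], None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if fields[0] == "default":
            default_url = fields[1]
        elif fields[0] == "rect":
            x1, y1 = (int(n) for n in fields[2].split(","))
            x2, y2 = (int(n) for n in fields[3].split(","))
            # Store corners in normalized order: left, top, right, bottom.
            regions.append((min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2), fields[1]))
    return regions, default_url


def url_for_click(regions, default_url, x, y):
    """Return the URL of the first rectangle containing (x, y), else the default."""
    for left, top, right, bottom, url in regions:
        if left <= x <= right and top <= y <= bottom:
            return url
    return default_url
```

The default entry is what a user gets after clicking in an area of the image that no region covers.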

(Up to Table of Contents)

5.6.2: How do I make a "link" that doesn't load a new page?

Such links are useful when a form is intended to perform some action on the server machine without
sending  new  information  to  the  client,  or  when  a  user  has  clicked  in  an  undefined  area  in  an  image 
map;  these  are  just  two  possibilities. 

Rob  McCool  of  NCSA  provided  the  following  wisdom  on  the  subject: 

Yechezkal-Shimon Gutfreund (sg04@gte.com) wrote:
: Ok, here is another bizarre request from me:
:
: I am currently running scripts which I "DO NOT" want to return
: any visible result. That is, not text/plain, not text/HTML, not
: image/gif. The entire results are the side effects of the
: script and nothing should be returned to the viewer.
:
: It would be nice to have an internally supported null viewer
: so that I could do this, more "cleanly" (ok, ok, I hear your groans).

HTTP  now  supports  a  response  code  of  204,  which  is  no  operation.  Some  browsers  such  as  Mosaic/X  1* 
support it. To use it, make your script an nph script and output an HTTP/1.0 204 header. Something like:

HTTP/1.0 204 No response
Server: Myscript/NCSA httpd 1.1

(You  can  learn  more  about  nph  scripts  from  the  NCSA  server  documentation  at  the  URL 
http://hoohoo.ncsa.uiuc.edu/docs.)  Essentially  they  are  scripts  that  handle  their  own  HTTP  response 
codes. 
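
A minimal sketch of such a script, written in Python purely for illustration. It assumes your server
runs it as an nph script (on the NCSA server that means a file name beginning with "nph-"), so that
whatever the script prints is sent to the browser as the complete response. The function name and the
placement of the side effects are hypothetical.

```python
import sys


def no_response(server="Myscript/NCSA httpd 1.1"):
    """Return the full text of a 204 response, headers only, no body."""
    lines = [
        "HTTP/1.0 204 No response",   # status line written by the script itself
        "Server: " + server,
        "",                           # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines)


def main():
    # ... perform the script's side effects here (hypothetical placeholder) ...
    sys.stdout.write(no_response())


if __name__ == "__main__":
    main()
```

Because the script writes its own status line, the server adds nothing. A browser that understands
204 should simply stay on the current page.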

(Up  to  Table  of  Contents) 

5.6.3:  Where  can  I  learn  how  to  create  fill-out  forms? 

You can read about the Common Gateway Interface (at the URL http://hoohoo.ncsa.uiuc.edu:80/cgi/).
In  addition  to  documenting  the  standard  interface  for  which  scripts  can  now  be  written  for  both  NCSA 
and  CERN-derived  servers,  these  pages  also  cover  HTML  forms  and  how  to  handle  the  results  on  the 
server  side.  See  the  section  on  email  forms  for  a  simple  solution  to  the  most  commonly  desired  form. 

(Up to Table of Contents)

5.6.3.1:  How  can  I  create  hidden  fields  in  forms  (keeping  state)? 

Use  INPUT  TYPE=hidden.  An  example: 

<INPUT TYPE=hidden NAME=state VALUE="hidden info to be returned with form">

By  now,  most  if  not  all  browsers  can  handle  the  hidden  type.  Note  that  "hidden"  doesn't  mean 
"secret";  the  user  can  always  click  on  "view  source". 

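On the server side, the hidden field comes back with the rest of the form as an ordinary name/value
pair. A minimal sketch in Python, assuming the form data arrives URL-encoded and that the field is
called "state" as in the example above (the function name and sample data are hypothetical):

```python
from urllib.parse import parse_qs


def read_state(form_data):
    """Return the value of the hidden 'state' field, or None if it is absent."""
    fields = parse_qs(form_data)          # decodes '+' and %XX escapes
    values = fields.get("state")
    return values[0] if values else None


# Sample submission as a browser might send it (hypothetical data).
submitted = "name=Alice&state=hidden+info+to+be+returned+with+form"
print(read_state(submitted))
```

Since the user can edit the value before sending it back, treat it like any other form input.
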
(Up  to  Table  of  Contents) 

5.6.3.2:  How  can  users  send  me  email  through  their  browsers? 

If  you  have  access  to  the  server's  configuration  files,  or  if  your  server  administrator  permits  users  to 
create  their  own  CGI  scripts,  you  can  arrange  it.  I've  written  a  simple  email  forms  package  (URL  is: 
http://siva.cshl.org/email/index.html),  which  does  it  in  ANSI  C.  There  is  also  a  package  floating 
around  in  Perl  (URL,  anyone?). 

(Up to Table of Contents)

5.6.4:  How  do  I  comment  an  HTML  document? 

Use the <!-- tag at the beginning of EACH line commented out; close this for EACH line with the -->
tag. Note that comments do not nest, and the sequence "--" may not appear inside a comment except as
part of the closing --> tag.

You  should  not  try  to  use  this  to  "comment  out"  HTML  that  would  otherwise  be  shown  to  the  user,  since 
some  browsers  (notably  Mosaic)  will  still  pay  attention  to  tags  inside  the  comment  and  close  it 
prematurely. 

Thanks  to  Joe  English  for  clearing  up  this  issue. 

(Up  to  Table  of  Contents) 

5.6.5:  How  can  I  create  decent-looking  tables  and  stop  using  <PRE>...  </PRE>? 

Tables are a standard feature in HTML Level 3, a new version of HTML. Unfortunately, they are at
present  implemented  only  by  the  Viola  and  Emacs-W3  browsers,  to  my  knowledge. 

However, there is a way to use HTML Level 3 tables now and convert them automatically to HTML,
allowing  you  to  design  proper  tables  and  install  those  pages  directly  when  table  support  arrives  in  the 
majority  of  clients.  You  can  do  this  using  the  html+tables  package,  by  Brooks  Cutter 
(bcutter@paradyne.com),  which  is  available  for  anonymous  ftp  from  sunsite.unc.edu  in  the  directory 
pub/packages/infosystems/WWW/tools/html+tables.shar. This package requires the scripting language
Perl, which is primarily used on Unix systems but is also available for other systems (such as MS-DOS
machines). html+tables accepts HTML Level 3 and outputs HTML using the <PRE>...</PRE> construct to
represent  tables,  allowing  you  to  write  HTML  Level  3  now,  knowing  that  it  will  look  better  when 
clients  are  ready  for  it. 

(Up  to  Table  of  Contents) 

5.6.6:  What  is  HTML  Level  3  and  where  can  I  learn  more  about  it? 

HTML  Level  3,  also  known  as  HTML+,  is  an  enhanced  version  of  HTML  designed  to  address  some  of  the 
limitations  of  HTML.  HTML  Level  3  supports  true  tables,  right-justified  text,  centered  text,  line  breaks 
that  do  not  double  space,  and  many  other  desired  features. 

However,  most  clients  support  only  a  handful  of  HTML  Level  3  features  (such  as  forms  in  Mosaic)  at 
this  time. 

You can access information about new developments in HTML at the CERN server (at the URL
http://info.cern.ch/hypertext/WWW/MarkUp/MarkUp.html).

(HTML  Level  1  is  the  original  version.  HTML  Level  2  is  essentially  the  same,  but  with  the  addition  of 
forms.) 

(Up  to  Table  of  Contents) 

5.6.7:  How  can  I  make  transparent  GIFs? 

Transparent  GIFs  are  useful  because  they  appear  to  blend  in  smoothly  with  the  user's  display,  even  if 
the user has set a background color that differs from the one the developer expected.

There is a document explaining transparent GIFs available at the URL
http://melmac.corp.harris.com/transparent_images.html. You can fetch the program giftrans by
anonymous ftp from ftp.rz.uni-karlsruhe.de at the path /pub/net/www/tools/giftrans.c.
There is also a utility for the Macintosh, Transparency (URL is:
http://www.med.cornell.edu/~giles/projects.html#transparency).

(Up  to  Table  of  Contents) 

5.6.8:  How  come  mailto:  URLs  don't  work? 


The mailto: URL is an innovation found in Lynx and a few other browsers. It is not yet found in
Mosaic, the most popular browser. Hopefully it will be present in future versions. In the meantime, you
can set up forms which send mail to you; there is documentation on this at the URL
http://siva.cshl.org/email/index.html.

(Up  to  Table  of  Contents) 

5.6.9:  How  can  I  restrict  and  control  access  to  my  server? 

All  major  servers  have  features  that  allow  you  to  limit  access  to  particular  sites,  and  many  clients 
have  authentication  features  that  allow  you  to  identify  specific  users.  There  is  a  tutorial  on  security 
and  user  authentication  with  the  NCSA  server  and  Mosaic  available,  written  by  Marc  Andreessen 
(URL  is  http://wintermute.ncsa.uiuc.edu:8080/auth-tutorial/tutorial.html).  See  your  server 
documentation  for  further  information. 

(Up  to  Table  of  Contents) 

5.6.10:  Which  format  is  better  for  WWW  image  purposes,  JPEG  or  GIF? 

JPEG  does  a  better  job  with  realistic  images  such  as  scanned  photographs.  Most  browsers  cannot  handle 
inline JPEGs, however, so you must link to them as external images (using a regular <A HREF...>
instead of <IMG SRC...>).

GIF  does  a  better  job  with  crisp,  sharp  images,  such  as  those  typically  used  to  construct  buttons,  graphs 
and  the  like.  All  browsers  that  can  display  graphics  at  all  can  display  GIFs  inline. 

(Up  to  Table  of  Contents) 

5.6.11:  How  can  I  mirror  part  of  another  server? 

Scripts  are  available  to  do  this,  but  at  this  time  they  are  not  very  friendly  to  the  server  you  are 
attempting  to  mirror;  their  behavior  resembles  that  of  the  more  poorly  written  WWW  robots.  If  you 
are  trying  to  improve  access  times  to  a  distant  server,  you  will  likely  find  the  "proxy"  capabilities  of 
CERN's  WWW  server  to  be  a  more  effective  and  general  solution  to  your  problem. 

(Up  to  Table  of  Contents) 

5.6.12:  How  can  I  keep  robots  off  my  server? 

Programs  that  automatically  traverse  the  web  can  be  quite  useful,  but  have  the  potential  to  make  a 
serious  mess  of  things.  Every  so  often  someone  will  write  a  "depth-first"  searching  robot  that  brings 
servers to their knees. See the section on writing robots (4.10) for details.

Fortunately,  most  robots  on  the  web  follow  a  simple  protocol  by  which  you  can  keep  them  off  your 
server if you wish, or keep them out of portions of your server which are robot traps (i.e., they contain
an  infinite  number  of  possible  links).  Read  the  document  World  Wide  Web  Robots,  Wanderers  and 
Spiders  (URL  is:  http://web.nexor.co.uk/mak/doc/robots/robots.html)  and  learn  about  the  emerging 
standards  for  exclusion  of  robots  from  areas  in  which  they  are  not  wanted.  You  can  also  read  about 
existing  robots  there,  including  useful  cataloging  robots  you  probably  do  not  want  to  keep  off  your 
server. 
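
A minimal sketch of how a well-behaved robot might apply such exclusion rules, assuming the rules are
written in the robots.txt style of "User-agent" and "Disallow" lines. The rule set, robot name and
host below are hypothetical, and the sketch uses the parser from Python's standard library rather
than any particular robot's code.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules keeping all robots out of two areas of a server.
RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /calendar/
"""


def allowed(rules_text, agent, url):
    """Return True if the named robot may fetch the URL under the given rules."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser.can_fetch(agent, url)


print(allowed(RULES, "ExampleRobot", "http://www.example.org/index.html"))
print(allowed(RULES, "ExampleRobot", "http://www.example.org/cgi-bin/search"))
```

The rules are advisory: they only help against robots that choose to read and obey them.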

(Up  to  Table  of  Contents) 

6: What newsgroups discuss the Web?

You can find discussion of World Wide Web topics in three newsgroups, and one newsgroup which will
soon be removed:

comp.infosystems.www.users 


A  forum  for  the  discussion  of  WWW  client  software  and  its  use  in  contacting  various  Internet 
information  sources.  New  user  questions,  client  setup  questions,  client  bug  reports, 
resource-discovery  questions  on  how  to  locate  information  on  the  web  that  can't  be  found  by  the 
means  detailed  in  the  FAQ  and  comparison  between  various  client  packages  are  among  the 
acceptable  topics  for  this  group.  Please  specify  what  browser  and  what  system  type  (Windows, 
Mac,  Unix,  etc.)  your  post  is  about  if  you  are  asking  questions  about  a  specific  program. 

comp.infosystems.www.providers

A  forum  for  the  discussion  of  WWW  server  software  and  the  use  of  said  software  to  present 
information  to  users.  General  server  design,  setup  questions,  server  bug  reports,  security  issues, 
HTML  page  design  and  other  concerns  of  information  providers  are  among  the  likely  topics  for 
this  group. 

comp.infosystems.www.misc 

A forum for general discussion of WWW (World Wide Web)-related topics that are NOT
covered  by  the  other  newsgroups  in  the  hierarchy.  This  will  likely  include  discussions  of  the 
Web's  future,  politicking  regarding  changes  in  the  structure  and  protocols  of  the  web  that  affect 
both  clients  and  servers,  et  cetera. 

comp.infosystems.www  (DEFUNCT) 

The  old  catch-all  newsgroup,  which  may  still  exist  on  your  system  but  will  be  removed  on 
September 7th, according to David Tale, moderator of news.announce.newgroups.

(Up  to  Table  of  Contents) 

7: I want to know more

To  find  out  more,  use  the  web.  This  FAQ  hopefully  provides  enough  information  for  you  to  locate  and 
install  a  browser  on  your  system.  If  you  have  system  specific  questions  regarding  FTP,  networking  and 
the  like,  please  consult  newsgroups  relevant  to  your  particular  hardware  and  operating  system! 

Once you're up and running, you may wish to consult the World Wide Web Primer by Nathan
Torkington. It is available at the URL
http://www.vuw.ac.nz/who/Nathan.Torkington/ideas/www-primer.html.

Later  you  may  return  to  this  FAQ  for  answers  to  some  of  the  more  advanced  questions.  I  encourage  you  to 
check  out  the  changes  listed  early  in  the  document  each  time  the  FAQ  appears. 

(Up  to  Table  of  Contents) 

8:  Credits 

• Thomas Boutell boutell@netcom.com
• Nathan Torkington Nathan.Torkington@vuw.ac.nz
• Marc Andreessen marca@ncsa.uiuc.edu
• Tony Johnson

(Up  to  Table  of  Contents) 


Tim Berners-Lee, Robert Cailliau, Ari Luotonen, Henrik Frystyk Nielsen, and Arthur Secret

The  World-Wide  Web 


The World-Wide Web (W3) was developed to be a pool of human knowledge, which would allow
collaborators in remote sites to share their ideas and all aspects of a common project. Physicists
and engineers at CERN, the European Particle Physics Laboratory in Geneva, Switzerland, collaborate
with many other institutes to build the software and hardware for high-energy physics research. The
idea of the Web was prompted by positive experience of a small "home-brew" personal hypertext
system used for keeping track of personal information on a distributed project. The Web was designed
so that if it was used independently for two projects, and later relationships were found between
the projects, then no major or centralized changes would have to be made, but the information could
smoothly reshape to represent the new state of knowledge. This property of scaling has allowed the
Web to expand rapidly from its origins at CERN across the Internet irrespective of boundaries of
nations or disciplines.

If you haven't yet experienced the Web, the best way to find out about it is to try it. An Appendix
to this article gives some recipes for getting hold of W3 clients. Given one of these, you will
quickly find out all you need to know, and much more. For hard copy to read on the plane, or if you
don't have Internet access from your desktop machine, refer to our paper in Electronic Networking
(see "Glossary and Further Reading") for an overview of the project, material which we will not
repeat but will summarize here.

A W3 "client" program runs on your computer. When it starts, it displays an object, normally a
document with text and possibly images. Some of the phrases and images are highlighted: in blue, or
boxed, or perhaps numbered, depending on what sort of a display you have and how your preferences
have been set. Clicking the mouse on the highlighted area ("anchor") causes the client program to
retrieve another object from some other computer, a "server." The retrieved object is normally also
in a hypertext format, so the process of navigation continues (see Figure 1).

When viewing some documents, the reader can request a search, by typing in plain text (or complex
commands) to send to the server, rather than following a link. In either case, the client sends a
request off to the server, often a completely different machine in some other part of the world, and
within (typically) a second, the related information, in either hypertext, plain text or multimedia
format, is presented. This is done repeatedly, and by a sequence of selections and searches one can
find anything that is "out there." Some important things to note are:

• Whatever type of server, the user interface is the same, so users do not need to understand the
differences between the many protocols in common use. Before W3, access to networked information
typically involved knowledge of many different access "recipes" for different systems, and a
different command language for each. The model of hypertext with text input has proved sufficiently
powerful to express all the user interfaces, while being sufficiently simple to require no training
for a computer user.

• Links can point to anything that can be displayed, including search result lists. (When a query is
applied to an object, the resulting object has an address, defined to be the address of the queried
object concatenated with the text of the query. As the result object has an address, one can make
links to it. Following the link later leads to a reevaluation of the query.)

• While menus and directories are available, the extra option of hypertext provides a more powerful
communications tool. In simple cases, the server program can generate a hypertext view representing
(for example) the directory structure of an existing file store. This allows existing data to be put
"on the Web" without further human effort.

• There is a very extendable system for introducing new formats for multimedia data.

• There are many W3 client programs. As hypertext information is transmitted on the network in
logical (mark-up) form, each client can interpret this in a way natural for the given platform,
making optimal use of fonts, colors, and other human interface resources available on that platform.

What  Does  W3  Define? 

W3 has come to stand for a number of things, which should be distinguished. These include

• The idea of a boundless information world in which all items have a reference by which they can be
retrieved;

• The address system (URI) which the project implemented to make this world possible, despite many
different protocols;

• A network protocol (HTTP) used by native W3 servers giving performance and features not otherwise
available;

• A markup language (HTML) which every W3 client is required to understand, and is used for the
transmission of basic things such as text, menus and simple on-line help information across the net;

• The body of data available on the Internet using all or some of the preceding listed items.

The  client-server  architecture  of  the 
Web  is  illustrated  in  Figure  2. 

Universal Resource Identifiers
Universal Resource Identifiers (URIs) are the strings used as addresses of objects (e.g., menus,
documents, images) on the Web. For example, the URI of the main page for the WWW project happens
to be

http://info.cern.ch/hypertext/WWW/TheProject.html

[Footnote: The Internet Engineering Task Force (IETF) is currently defining a similar and derived
syntax known as a Uniform Resource Locator (URL). As this work is not complete, and there is no
guarantee that URLs will have the same syntax or properties as URIs, we use the term URI here to
avoid confusion.]

Communications of the ACM, August 1994/Vol. 37, No. 8

URIs are "Universal" in that they encode members of the universal set of network addresses. For a
new network protocol that has some concept of object, one can form an address for any object as the
set of protocol parameters necessary to access the object. If these parameters are encoded into a
concise string, with a prefix to identify the protocol and encoding, one has a new URI scheme. There
are URIs for Internet news articles and newsgroups (the NNTP protocol), and for FTP archives, for
telnet destinations, email addresses, and so on. The same can be done for names of objects in a
given name space.

The prefix "http" in the preceding example indicates the address space, and defines the
interpretation of the rest of the string. The HTTP protocol is to be used, so the string contains
the address of the server to be contacted, and a substring to be passed to the server. Different
protocols use different syntaxes, but there is a small amount of common syntax. For example, the
common URI syntax reserves the "/" as a way of representing a hierarchical space, and "?" as a
separator between the address of an object and a query operation applied to it. As these forms recur
in several information systems, expressing them in the common syntax allows the features to be
retained in the common model, where appropriate. Hierarchical forms are useful for hypertext, where
one "work" may be split up into many interlinked documents. Relative names exploit the hierarchical
structure and allow links to be made within the work independent of the higher parts of the URI such
as the server name.
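
The common syntax described above can be illustrated with a short sketch. It uses Python's standard
urllib.parse module, which is an assumption of this example and not part of the W3 software. The
relative name "Other.html" and the query URI are hypothetical.

```python
from urllib.parse import urlsplit, urljoin


def describe(uri):
    """Split a URI into the pieces named in the text."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,    # prefix naming the address space
        "server": parts.netloc,    # server to be contacted
        # "/"-separated levels of the hierarchical space
        "hierarchy": [level for level in parts.path.split("/") if level],
        "query": parts.query,      # text after the "?" separator
    }


project = "http://info.cern.ch/hypertext/WWW/TheProject.html"
print(describe(project))

# A relative name is resolved against the document containing the link,
# so the link does not repeat the server name or the higher levels.
print(urljoin(project, "Other.html"))

# Hypothetical searchable object: the query text follows "?".
print(describe("http://info.cern.ch/hypertext/index?particle+physics")["query"])
```

Because the relative link carries no server name, a whole "work" can be moved to another server
without editing the links inside it.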

URI syntax allows objects to be addressed not only using HTTP, but also using the other common
networked information protocols in use today (FTP, NNTP, Gopher, and WAIS), and will allow
extension when new protocols are developed.

URIs are central to the W3 architecture. The fact that it is easy to address an object anywhere on
the Internet is essential for the system to scale, and for the information space to be independent
of the network and server topology.

Hypertext Transfer Protocol
Perhaps misnamed, rather than being a protocol for transferring hypertext, HTTP is a protocol for
transferring information with the efficiency necessary for making hypertext jumps. The data
transferred may be plain text, hypertext, images, or anything else.

When a user browses the Web, objects are retrieved in rapid succession from often widely dispersed
servers. For small documents, the limitations to the response time stem mainly from the number of
round trip delays across the network necessary before the rendition of the object can be started.
HTTP is therefore a simple request/response protocol.

HTTP does not only transfer HTML documents. Although HTML comprehension is required of W3 clients,
HTTP is used for retrieving documents in an unbounded and extensible set of formats. To achieve
this, the client sends a (weighted) list of the formats it can handle, and the server replies with
data in any of those formats that it can produce. This allows proprietary formats to be used between
consenting programs in private, without the need for standardization of those formats. This is
important both for high-end users who share data in sophisticated forms, and also as a hook for
formats that have yet to be invented. The same negotiation system is used for natural language
(English, French, for example) where available, as well as for compression forms.
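
The negotiation step can be sketched as follows. The article does not give the wire syntax, so this
sketch assumes the weighted list is written as comma-separated format names with an optional "q"
weight between 0 and 1, as in later HTTP specifications. The function names are hypothetical, and
wildcard formats are not handled.

```python
def parse_weighted(header):
    """Parse 'text/html, image/gif; q=0.8' into a {format: weight} mapping."""
    weights = {}
    for item in header.split(","):
        fields = [field.strip() for field in item.split(";")]
        if not fields[0]:
            continue
        weight = 1.0                      # a format with no weight counts as 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                weight = float(value)
        weights[fields[0]] = weight
    return weights


def choose_format(header, available):
    """Pick the format the client prefers among those the server can produce."""
    weights = parse_weighted(header)
    candidates = [fmt for fmt in available if weights.get(fmt, 0.0) > 0.0]
    if not candidates:
        return None
    return max(candidates, key=lambda fmt: weights[fmt])


client_list = "text/plain; q=0.5, text/html, image/gif; q=0.8"
print(choose_format(client_list, ["image/gif", "text/plain"]))
```

A format absent from the client's list is never chosen, which is how a private format can stay
private to the programs that agree on it.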

HTTP is an Internet protocol. It is similar in its readable, text-based style to the File Transfer
(FTP) and Network News (NNTP) Protocols that have been used to transfer files and news on the
Internet for many years. Unlike these protocols, however, HTTP is stateless. (That is, it runs over
a TCP connection that is held only for the duration of one operation.) The stateless model is
efficient when a link from one object may lead equally well to an object stored on the same server,
or to another distant server. The purpose of a reference such as a URI is that it should always
refer to the "same" (in some sense) object. This also makes a stateless protocol appropriate, as it
returns results based on the URI but irrespective of any previous operations performed by the
client.

The HTTP request from the client starts with an operation code (known as the method, in conformance
with object-oriented terminology) and the URI of the object. The "GET" method used by all browsers
is defined to be idempotent in that it should preserve the state of the Web (apart from billing for
the information transfer, and statistics). A "PUT" method is defined for front-end update, and a
"POST" method for the attachment of a new document to the Web, or submission of a filled-in form or
other object to some processor. Use of PUT and POST is currently limited, partly due to scarcity of
hypertext editors. The extension to other methods is a subject of study.

When objects are transferred over the network, information about them ("metainformation") is
transferred in HTTP headers. The set of headers is an extension of the Multipurpose Internet Mail
Extensions (MIME) set. This design decision was taken to open the door to integration of hypermedia
mail, news, and information access. Unlike in email, transfer in binary, and transfer in nonstandard
but mutually agreed document formats is possible. This allows, for example, servers to indicate
links from, and titles of, documents (such as bit-map images) whose data format does not otherwise
include such information.
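
A minimal sketch of the request line and of MIME-style header handling, assuming the HTTP/1.0
framing of a status line followed by "Name: value" headers. Python's standard email parser is used
here only because the headers follow the mail header format; the function names, the "Title" header
and the sample response are hypothetical.

```python
from email.parser import Parser


def build_request(method, uri):
    """Return a minimal request: the method, the URI, then an empty header block."""
    return "{0} {1} HTTP/1.0\r\n\r\n".format(method, uri)


def parse_response_head(head):
    """Split a response head into its numeric status code and its headers."""
    status_line, _, header_text = head.partition("\r\n")
    _version, code, _reason = status_line.split(" ", 2)
    headers = Parser().parsestr(header_text, headersonly=True)
    return int(code), headers


print(build_request("GET", "/hypertext/WWW/TheProject.html"))

# Hypothetical response describing a bit-map image that cannot carry its own title.
sample = ("HTTP/1.0 200 OK\r\n"
          "Content-Type: image/gif\r\n"
          "Title: Atlas engineering drawing\r\n"
          "\r\n")
code, headers = parse_response_head(sample)
print(code, headers["Content-Type"], headers["Title"])
```

A client written this way reads only the headers it asks for by name, so any others pass through
without causing an error.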

The convention that unrecognized HTTP headers and parameters are ignored has made it easy to try new
ideas on working production servers. This has allowed the protocol definition to evolve in a
controlled way by the incorporation of tested ideas.

Hypertext Markup Language (HTML)
Despite the ability of HTTP to negotiate formats, W3 needed a common basic language of interchange
for hypertext. HTML is that language, and much of the fabric of the Web is constructed out of it. It
was designed to be sufficiently simple so as to be easily produced by both people and programs, but
also to adhere to the SGML standard in that a valid HTML document, if attached to SGML declarations
including the HTML "DTD," may be parsed by an SGML parser. HTML is a markup language that does not
have to be used with HTTP. It can be used in hypertext email (it is proposed as a format for MIME),
news, and anywhere basic hypertext is needed. It includes simple structure elements, such as several
levels of headings, bulleted lists, menus and compact lists, all of which are useful when presenting
choices, and in on-line documents.

Under development is a much enriched version of HTML known as HTML+. This includes features for
more sophisticated on-line documentation, form templates for the entry of data by users, tables and
mathematical formulae. Currently many browsers support a subset of the HTML+ features in addition
to the core HTML set.

Figure 1. Using the World-Wide Web. Shown here is the authors' prototype World-Wide Web application
for NextStep machines. The application initially displays the user's "home" page (top) of personal
notes and links. Clicking on underlined text takes the reader to new documents. In this case, the
user visited the Virtual Library, and, in the high energy physics department, found a link to CERN.
Linked to CERN was the "Atlas" collaboration's web including an engineering drawing (background). To
save having to follow the same path again, the link menu (shown) allows a new link to be made, for
example from text typed into the home page, directly to the Atlas information.

HTML is defined to be a language of communication, which actually flows over the network. There is
no requirement that files are stored in HTML. Servers may store files in other formats, or in
variations on HTML that include extra information of local interest only, and then generate HTML on
the fly with each request.
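
Generating HTML on the fly can be as simple as turning a list of file names into a page of links. A
minimal sketch, assuming the server already has the names to hand; the function name, the page
layout and the choice of tags are illustrative and not taken from any particular server.

```python
import html
import os
from urllib.parse import quote


def directory_as_html(title, names):
    """Return an HTML page listing each name as a link, in sorted order."""
    items = [
        # quote() makes the name safe inside a URI; escape() makes it safe as text
        '<LI><A HREF="{0}">{1}</A>'.format(quote(name), html.escape(name))
        for name in sorted(names)
    ]
    return "<TITLE>{0}</TITLE>\n<H1>{0}</H1>\n<UL>\n{1}\n</UL>\n".format(
        html.escape(title), "\n".join(items))


if __name__ == "__main__":
    # Build a page for the current directory each time the program runs.
    print(directory_as_html("Files", os.listdir(".")))
```

Nothing is stored as HTML here. The page exists only for the duration of the request, so it always
reflects the current contents of the file store.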


W3  and  Other  Systems 

Two other systems, WAIS (from Thinking Machines Corporation and now WAIS, Inc.) and Gopher (from
the University of Minnesota), share W3's client-server architecture and a certain amount of its
functionality. Table 1 indicates some of the differences.

The WAIS protocol is influenced largely by the Z39.50 protocol designed for networking library
catalogs. It allows a text-based search, and retrieval following a search. Indexes to be searched
are found by searching in a master index. This two-stage search has been demonstrated to be
sufficiently powerful to cover the current world of WAIS data. There are no navigational tools to
allow the reader to be shown the available resources, however, or guided through the data: the
reader is "parachuted in" to a hopefully relevant spot in the information world, but left without
context.

Table 1. A comparison of three popular network information projects.
Registered server figures taken April 27, 1993 and April 15, 1994. WAIS: from Thinking Machines
Corporation directory, number of distinct hosts. Gopher: from "All the Gophers in the world" register
at the University of Minnesota. W3: from Geographical registry at CERN. In all cases many more
servers exist which are not directly registered, so these are a very rough guide with no indication
of quantity or quality of information at each host.

                               WAIS                  Gopher                World-Wide Web
Original target application    Text-based            Campus-wide           Collaborative
                               information           information           work
                               retrieval             (CWIS)
Typical objects
  Text                         YES                   YES                   YES
  Menus, Graphics              NO                    YES                   YES
  Hypertext                    NO                    NO                    YES
Search functions
  Text search                  YES                   YES                   YES
  Relevance feedback           YES                   NO                    NO
  Reference to other servers   NO                    YES                   YES
Registered servers
  April 1993                   113                   455                   62
  April 1994                   137                   1410                  829

Gopher provides a free text search mechanism, but principally uses menus. A menu is a list of
titles, from which the user may pick one. While gopher space is in fact a web containing many loops,
the menu system gives the user the impression of a tree. The Veronica server provides a master index
for gopher space.

The W3 data model is similar to the gopher model, except that menus are generalized to hypertext
documents. In both cases, simple file servers generate the menus or hypertext directly from the file
structure of a server. The W3 hypertext model gives the program more power to communicate the
options available to the reader, as it can include headings and various forms of list structure, for
example, within the hypertext.


All three systems allow for the provision of graphics, sound and video, although because the WAIS
system only has access by text search, text has to be associated with graphics files to allow them
to be found.

W3 clients provide access to servers of all types, as a single
simple interface to the whole Web is considered very important.
Unknown to the user, several protocols are in use behind the
scenes. A common code library "libwww" put into the public domain
by CERN has promoted this uniformity. Whereas one would not wish
to see greater proliferation of protocols, the existence of more
than one protocol probably allows for the most rapid progress
during this phase in the development of the field. It also allows
a certain limited confidence that, if an architecture can
encompass older systems and allow transition to current systems,
it will, by induction, be able to provide a transition to newer
and better ideas as they are invented.
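
The idea of one interface over several protocols can be sketched as
a lookup on the scheme at the front of an address. This is only an
illustration of the principle, not the libwww design: the table
PROTOCOLS, the function plan_fetch, and the example.org host are
hypothetical, and no network connection is made.

```python
from urllib.parse import urlsplit

# Hypothetical table mapping an address scheme to the protocol a
# client would speak behind the scenes.
PROTOCOLS = {
    "http": "HTTP",
    "ftp": "FTP",
    "gopher": "Gopher",
    "wais": "WAIS",
}


def plan_fetch(address):
    """Return (protocol, host, path) for an address, without fetching it."""
    parts = urlsplit(address)
    try:
        protocol = PROTOCOLS[parts.scheme]
    except KeyError:
        raise ValueError(f"no handler for scheme {parts.scheme!r}") from None
    return protocol, parts.hostname, parts.path or "/"


for address in (
    "http://info.cern.ch/hypertext/WWW/TheProject.html",
    "ftp://ftp.ncsa.uiuc.edu/",
    "gopher://example.org/",  # placeholder host, not from the article
):
    print(plan_fetch(address))
```

The reader sees only the address and the resulting document; which
protocol was used is a detail of the table. Supporting a new
protocol means adding one entry rather than changing the interface.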

Recent  W3  Developments 

This article, like others in this issue, was derived from material
written in April 1993 for the INET'93 conference. Growth of the
Web since that time has been so great that this section has been
completely rewritten. There are now 829 (May: 1,248) rather than
62 registered HTTP servers, and many more client programs
available than there were then.

The initial prototype W3 client was a "Wysiwyg" hypertext
browser/editor using NeXTStep. We developed a line mode browser,
and were encouraging the development of a good browser for X
workstations. One year ago, NCSA's Mosaic W3 browser was in wide
use on X workstations. Its ease of installation and use was a
major reason for the spread of the Web. Today there are many
browsers available for workstations, Macintosh and IBM/PC
compatible machines, and for users with character-based terminals.
Of the latter category, "Lynx" from the University of Kansas
provides full-screen access to the Web for users with character
terminals or emulators running on personal computers. Since new
software is appearing frequently, readers are advised to check the
lists on the Web for those most suited to their needs.

The availability of browsers and the availability of quality
information have provoked each other. One available indicator of
growth has been Merit Inc.'s count of the traffic of various
different protocols across the NSF T3 backbone in the U.S. (see
Figure 3).

An indicator of the uptake rate of clients is the load on the
info.cern.ch W3 server at CERN, which provides information about
the Web itself. That load more than doubled every 4 months over
the three years between April 1991 and April 1994.
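
To put that rate in perspective, the arithmetic can be worked
through as follows. This is a back-of-the-envelope sketch, assuming
the doubling compounded evenly over the whole period; the function
name minimum_growth is made up for the example, and because the
text says "more than doubled" the result is only a lower bound.

```python
def minimum_growth(months, doubling_period_months):
    """Lower bound on growth if the load doubles once per full period."""
    doublings = months // doubling_period_months
    return doublings, 2 ** doublings


# Three years (April 1991 to April 1994) with a 4-month doubling period.
doublings, factor = minimum_growth(months=36, doubling_period_months=4)
print(doublings, factor)  # 9 512
```

Nine doublings in 36 months mean the load in April 1994 was at
least 512 times the load in April 1991.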

Information providers have also blossomed. Some of these provide
simple overviews of what is available at particular institutes or
in particular fields. Others use the power of the W3 model to
provide a virtual world of great richness. Examples of servers
that use hypertext in interesting ways are the RAL-Durham Particle
Database, and the Legal Information Institute's hypertexts of
several great tomes of American law. Franz Hoesel's hypertext
version of the Vatican's Renaissance Culture exhibit at the
Library of Congress set an example that was followed by many
collections of art, history and other fields. The Palo Alto town
hall runs a server with everything from building regulations to
restaurants. As an example of the increasing use of the Web for
commerce, a user-friendly virtual clothing store prompts for one's
size, and points to a virtual store containing only those clothes
that are the right size and also in stock.

The  Future 

The W3 initiative occupies the meeting point of many fields of
technology. Users put pressure and effort into bringing about the
adoption of W3 in new areas. Apart from being a place of
communication and learning, and a new market place, the Web is a
show ground for new developments in information technology. Some
of the developments that we look forward to in the next few years
include

•  The implementation of a name service that will allow documents
to be referenced by name, independent of their location;

•  Hypertext editors allowing nonexpert users to make hypertext
links to organize published information. This will bring the goal
of computer-supported collaboration closer, with front-end update,
and annotation;

•  More sophisticated document type definitions providing for the
needs of


[Figure 2: diagram omitted. Labels recovered from the diagram:
Terminal emulator; PC or Macintosh; Unix X11; NeXTStep; Addressing
scheme, Protocols, Format Negotiation; Gateway is HTTP server plus
other application; Database, Info system, etc.]

Figure 2. The World-Wide Web client-server architecture. For
published information to be universally available, W3 relies on a
common addressing syntax, a set of common protocols, and
negotiation of data formats.


[Figure 3: graph omitted. Vertical axis: logarithmic scale from 10
Megabytes to 10 Terabytes. Horizontal axis: months from 9211 to
9403.]

Figure 3. Traffic in bytes per month across the NSF T3 backbone in
the U.S. File Transfer Protocol (FTP) was traditionally used to
access archives of software. FTP uses separate connections for
control and data flow. WAIS arose as an interface to text
retrieval systems, Gopher protocol with menu-style interfaces, and
W3's HTTP with hypertext and multimedia. W3 clients handle many
protocols to access all these worlds of data as a seamless
continuum, but new W3 servers use HTTP by preference. Each
vertical division represents a tenfold increase in traffic. The
horizontal divisions are months. Data: Merit
<ftp://ftp.merit.edu/statistics/nsfnet>



FTP: File Transfer Protocol. Postel, J. and Reynolds, J. File
Transfer Protocol. Internet RFC 959, October 1985.
<ftp://ds.internic.net/rfc/rfc959.txt>

Gopher: The Internet Gopher. Anklesaria, F. et al. The Internet
Gopher Protocol. Internet RFC 1436, March 1993.
<ftp://ds.internic.net/rfc/rfc1436.txt>

HTML: Hypertext Markup Language. Berners-Lee, T., and Connolly, D.
Hypertext Markup Language Protocol.
<ftp://info.cern.ch/pub/www/doc/html-spec.ps, .txt>

HTTP: Hypertext Transfer Protocol. Berners-Lee, T. Hypertext
Transfer Protocol.
<ftp://info.cern.ch/pub/www/doc/http-spec.ps, .txt>

MIME: Multipurpose Internet Mail Extensions. Borenstein, N., and
Freed, N. MIME (Multipurpose Internet Mail Extensions): Mechanisms
for Specifying and Describing the Format of Internet Message
Bodies. Internet RFC 1341, June 1992.

NNTP: Network News Transfer Protocol. Kantor, B. and Lapsley, P. A
proposed standard for the transmission of news. Internet RFC 977,
1986.

URI: Universal Resource Identifier. Berners-Lee, T. Universal
Resource Identifiers for the World-Wide Web. Submitted as an
Internet RFC as yet unnumbered. See
<http://info.cern.ch/hypertext/WWW/Addressing/Addressing.html> for
pointers to information on this area.

WAIS: Wide Area Information Servers. See Addyman, T. WAIS:
Strengths, Weaknesses and Opportunities. In Proceedings of
Information Networking 93 (London, May 1993), Meckler, London.

W3: Berners-Lee, T.J., Cailliau, R., Groff, J-F, Pollermann, B.
World-Wide Web: The Information Universe. Electronic Networking:
Research, Applications and Policy. (Spring 1992), 52-58. See also
documents in <ftp://info.cern.ch/pub/www/doc> and information
referenced by <http://info.cern.ch/hypertext/WWW/TheProject.html>




commercial  publishers  of  on-line 
material; 

•  The development of a common format for hypertext links from
two- and three-dimensional images giving more exciting interface
possibilities;

•  Integration with concurrent editors and other real-time
features such as teleconferencing and virtual reality;

•  Easy-to-use servers for low-end machines to ease publication of
information by small groups and individuals;

•  Evolution of objects from being principally human-readable
documents to contain more machine-oriented semantic information,
allowing more sophisticated processing;

•  Conventions on the Internet for charging and commercial use to
allow direct access to for-profit services.

Conclusion 

It is intended that after reading this article you will have an
idea of what W3 is, where it fits in with other systems in the
field, and where it is going. There is much more to be said,
especially about providing information, but this is described on
the Web itself. Also in the "Web about the Web" are lists of
contributed research and development work and ideas, and pointers
to work in progress, so that those interested can work together.

The Web does not yet meet its design goal of being a pool of
knowledge that is as easy to update as to read. That level of
immediacy of knowledge sharing waits for easy-to-use hypertext
editors to be generally available on most platforms. Most
information has in fact passed through publishers or system
managers of one sort or another. However, the incredible diversity
of information available gives great credit to the creativity and
ingenuity of information providers, and points to a very exciting
future.

Getting Started

If you have a vt100 terminal, you can try out a full-screen
interface by telnet to ukanaix.cc.ukans.edu and logging in as www.
With any terminal, you can telnet to info.cern.ch for the simplest
interface. These browsers are also available in source and in some
cases binary form. Details of status and coordinates of about 20
different browsers are available on the Web: just follow a link to
World-Wide Web, and select "software available."

The kernel W3 code (a common code library, and basic server and
clients) from CERN is in the public domain. (All protocols and
specifications are public domain.) It is available by anonymous
FTP from info.cern.ch.

NCSA's "Mosaic" browser for W3 is available for X, Mac or
PC/Windows by anonymous FTP from ftp.ncsa.uiuc.edu, currently
without charge for academic users.

About  the  Authors: 

TIM BERNERS-LEE originated the World-Wide Web in 1990 to enable
the sharing of knowledge by complex distributed teams. At CERN he
coordinates W3 development by collaborating with institutes around
the world. Current research interests include text processing,
graphics, communications software, and system design.
email: timbl@info.cern.ch

ROBERT CAILLIAU coordinates the use of W3 by CERN experiments and
other physics institutes. He is a long-time user of HyperCard, and
has been working on W3 since 1991, contributing many ideas, and
some software for the Macintosh.
email: cailliau@www.cern.ch

ARI LUOTONEN is a member of CERN's technical student program in
conjunction with his studies at Tampere University of Technology,
Finland. Current research interests include developing CERN's
"httpd" HTTP server for Unix and VMS systems.
email: luotonen@www.cern.ch

HENRIK FRYSTYK NIELSEN, of Aalborg University, Denmark, is also a
CERN technical student. He is working on the kernel code, with
research interests in enhanced networking protocols.
email: frystyk@info.cern.ch

ARTHUR SECRET wrote the first gateway giving W3 access to a
relational database in 1992, while studying Computer Science at
Ecole Internationale des Sciences du Traitement de l'Information
in Paris, France, as a CERN technical student. Among other tasks
in the CERN W3 team, he currently organizes the cataloging of new
W3 material in the "virtual library."
email: secret@info.cern.ch

Authors' Present Address: CERN, 1211 Geneva 23, Switzerland.


Permission to copy without fee all or part of this material is
granted provided that the copies are not made or distributed for
direct commercial advantage, the ACM copyright notice and the
title of the publication and its date appear, and notice is given
that copying is by permission of the Association for Computing
Machinery. To copy otherwise, or to republish, requires a fee
and/or specific permission.

© ACM 0002-0782/94/0800 $3.50


August 1994/Vol.37, No.8 COMMUNICATIONS OF THE ACM


